Methods and kits for identifying pre-cancerous colorectal polyps and colorectal cancer

ABSTRACT

Methods and kits for identifying a subject having pre-cancerous advanced polyps or colorectal cancer based on the expression profile(s) of specific mRNA biomarkers. Methods and kits for diagnosing, preventing, managing therapy, monitoring and identifying predisposition to colorectal cancer.

RELATED APPLICATIONS

This application is a Continuation of Ser. No. 15/300,859 filed on Sep. 30, 2016, which is a National Phase of PCT Patent Application No. PCT/IL2015/050362 having International filing date of Apr. 2, 2015, which claims the benefit of priority of U.S. Provisional Application No. 61/977,636 filed on Apr. 10, 2014. The contents of the above applications are all incorporated by reference as if fully set forth herein in their entirety.

FIELD OF THE INVENTION

The present invention relates, according to some embodiments, to methods and kits for identifying a subject having pre-cancerous advanced polyps or colorectal cancer based on the expression profile(s) of specific mRNA biomarkers. The present invention further comprises methods and kits for diagnosing, preventing, managing therapy, monitoring and identifying predisposition to colorectal cancer.

BACKGROUND OF THE INVENTION

Colorectal cancer (CRC) is one of the most common cancers, accounting for approximately 10% of all cancer cases and approximately 8% of all cancer deaths. Solid cancers are normally diagnosed based on a histopathological tissue evaluation, where the gold standard for CRC is fiber-optic colonoscopy. This technology is labor intensive, time consuming, costly and extremely invasive. The alternative of fecal occult blood test (FOBT), while not as invasive, is known to suffer from low sensitivity.

Screening and monitoring assays are essential for early detection and management of cancer. Blood-based tests enable large-scale screening of clinically asymptomatic (supposedly healthy) individuals, for diagnosis, monitoring and prediction of cancer. Furthermore, blood-based sampling is prevalent and convenient, and therefore may increase compliance in asymptomatic populations.

Bonilla et al. (Oncology Letters, 2, 719-714, 2011) disclose mRNA biomarkers associated with poor outcome in patients suffering from advanced stages of colorectal cancer.

Comprehensive lists of hundreds of genes that may be associated with colorectal cancer were disclosed, for example, in Ye et al., PLoS One, 2013; 8 (5), e62870; and Garcia et al., Clinical Chem. 53 (10): 1860-1863, 2007. Marshall et al. (Int J Cancer 2010; 126: 1177-1186) disclose a biomarker for CRC based on RNA extracted from peripheral blood cells corresponding to a panel of seven genes: ANXA3, CLEC4D, LMNB1, PRRG4, TNFAIP6, VNN1 and IL2RB.

US 2010/0330079 discloses a method for the detection of protein biomarkers for early diagnosis and management of colorectal cancer. The method includes obtaining quantitative information about the expression of 51 genes in peripheral blood.

WO 2011/012136 discloses a method for discriminating between CRC and non-cancerous samples based on the expression level of a group of miRNAs.

There is an unmet need for cost-effective, rapid, accurate and minimally invasive methods and kits for early detection and treatment of pre-cancerous advanced polyps and colorectal cancer, with improved sensitivity and specificity.

SUMMARY OF THE INVENTION

The present invention provides methods and kits for identifying colon cancer and precancerous polyps in a subject. Advantageously, the methods and kits of the invention differentiate a colon having precancerous advanced polyps from colorectal cancer, based on a non-invasive molecular based analysis. Moreover, the methods and kits of the invention provide a diagnostic platform with high sensitivity (at least 60%) and high specificity (above 85%).

The present invention is premised on the discovery that disease-associated biomarkers can be identified in plasma or other bodily fluids long before an overt disease is apparent. Another advantage conferred by the biomarkers of the present invention arises from the fact that the biomarkers are extracellular and thereby originate from all body tissues. Moreover, these biomarkers are not affected by the immune response. The presence or absence of these biomarkers in the plasma footprints of patients suffering from colorectal cancer is provided herein as an early diagnostic tool, on the basis of which treatment strategies can be devised and administered to prevent, delay or reverse the formation of neoplastic colorectal cells. One or combinations of several of the disease-associated biomarkers of the present invention are useful to diagnose subjects suffering from precancerous advanced polyps or colorectal cancer, or advantageously, to diagnose those subjects who are asymptomatic for colorectal cancer.

Surprisingly, as demonstrated herein, the methods of the invention use the expression profile of a finite number of nucleic acid sequence biomarkers to identify a healthy subject, a subject having colorectal cancer and a subject having precancerous advanced polyps. Furthermore, the biomarkers of the invention are identified in a plasma specimen, which is remote from the site of disease. Unexpectedly, said plasma based biomarkers provide a differentially expressed gene profile which correlates at high specificity and high sensitivity with the pathology examination report.

According to some embodiments, there is provided a method for identifying a subject having colorectal cancer or precancerous advanced colorectal polyps, the method comprising:

(a) providing a biological sample from a subject;

(b) measuring the expression levels of a biomarker comprising a nucleic acid sequence set forth in SEQ ID NO: 1 in said biological sample; and

(c) identifying an expression level of said biomarker above a cutoff value for said biomarker, thereby identifying said subject as having colorectal cancer or precancerous advanced colorectal polyps.

According to some embodiments, said biomarker comprises SEQ ID NO: 1 and further comprises at least one nucleic acid sequence selected from SEQ ID NOs: 2, 3, 5-7, 12 and 17. Each possibility is a separate embodiment of the present invention.

According to some embodiments, said biomarker comprises the nucleic acid sequences set forth in SEQ ID NOs: 1-3, 5-7, 12 and 17 and said subject is identified as having colorectal cancer.

According to some embodiments, said biomarker consists of the nucleic acid sequences set forth in SEQ ID NOs: 1-3, 5-7, 12 and 17.

According to some embodiments, said biomarker further comprises the nucleic acid sequences set forth in SEQ ID NOs: 1 and 5 and said subject is identified as having precancerous advanced colorectal polyps.

According to some embodiments, said biomarker consists of the nucleic acid sequences set forth in SEQ ID NOs: 1 and 5.

According to some embodiments, said biomarker comprises SEQ ID NO: 1 and further comprises at least one nucleic acid sequence selected from SEQ ID NOs: 3, 4, 6 and 14. Each possibility is a separate embodiment of the present invention.

According to some embodiments, said biomarker comprises SEQ ID NOs: 1 and 4 and at least one nucleic acid sequence selected from SEQ ID NOs: 3, 6 and 14. Each possibility is a separate embodiment of the present invention.

According to some embodiments, said biomarker comprises SEQ ID NOs: 1, 3 and 4.

According to some embodiments, said biomarker comprises SEQ ID NOs: 1, 4, 6 and 14.

According to some embodiments, said biomarker comprises SEQ ID NOs: 1, 3, 4 and 14.

According to some embodiments, said biological sample is selected from the group consisting of blood, plasma, saliva, serum and a combination thereof. Each possibility is a separate embodiment of the present invention.

According to some embodiments, said biological sample is plasma extracted from peripheral blood.

According to some embodiments, the biomarker is circulating mRNA.

According to some embodiments, measuring the expression of said biomarker comprises at least one nucleic acid analysis technique selected from: polymerase chain reaction (PCR), quantitative PCR, nucleic acid sequencing technology, restriction digestion, specific hybridization, single stranded conformation polymorphism assays (SSCP) and electrophoretic analysis. Each possibility is a separate embodiment of the present invention.

According to some embodiments, measuring the expression of said biomarker comprises extracting mRNA from the plasma, reverse transcribing said mRNA into cDNA and measuring the expression level of said cDNA using quantitative-PCR.
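The quantitative-PCR step above can be illustrated with a minimal sketch. It assumes that expression is normalized against housekeeping (reference) genes using the common 2^-ΔCt convention; the text does not state which normalization formula is used, so this formula, the function name and the Ct values below are illustrative assumptions only. HPRT1 and TFRC are the housekeeping genes named in the description of the drawings.

```python
from statistics import mean


def relative_expression(ct_target, ct_references):
    """Relative expression of a target by the 2^-dCt convention (assumed here).

    ct_target: quantification cycle (Ct) of the target cDNA.
    ct_references: Ct values of the reference (housekeeping) genes.
    A lower Ct means more template, so a higher result means higher expression.
    """
    delta_ct = ct_target - mean(ct_references)
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values for one plasma sample (not taken from the source).
ct_refs = {"HPRT1": 26.0, "TFRC": 24.0}
level = relative_expression(27.0, list(ct_refs.values()))
print(level)
```

In this toy sample the target amplifies two cycles later than the mean of the reference genes, so its normalized level is one quarter of the reference level.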

According to some embodiments, there is provided a method for identifying a subject having colorectal cancer or precancerous advanced colorectal polyps, the method comprising:

(a) providing a biological sample from the subject;

(b) measuring the expression levels of a biomarker comprising a nucleic acid sequence set forth in SEQ ID NO: 2 in said biological sample; and

(c) identifying an expression level of said biomarker above a cutoff value for said biomarker, thereby identifying said subject as having colorectal cancer or precancerous advanced colorectal polyps.

According to some embodiments, said biomarker comprises SEQ ID NO: 2 and further comprises at least one nucleic acid sequence selected from SEQ ID NOs: 1, 3, 5-7, 12 and 17. Each possibility is a separate embodiment of the present invention.

According to some embodiments, there is provided a method for identifying a subject having colorectal cancer or precancerous advanced colorectal polyps, the method comprising:

(a) providing a biological sample from the subject;

(b) measuring the expression levels of a biomarker comprising a plurality of nucleic acid sequences, said plurality comprises SEQ ID NO: 1 and at least one nucleic acid sequence selected from SEQ ID NOs: 2, 3, 5-7, 12 and 17 in said biological sample; and

(c) identifying an expression level of said biomarker above a cutoff value for said biomarker, thereby identifying said subject as having colorectal cancer or precancerous advanced colorectal polyps.

According to some embodiments, there is provided a method for identifying a subject having colorectal cancer or precancerous advanced colorectal polyps, the method comprising:

(a) providing a biological sample from the subject;

(b) measuring the expression levels of a biomarker comprising a plurality of nucleic acid sequences, said plurality comprises SEQ ID NOs: 6, 9 and 14; and

(c) identifying an expression level of said biomarker above a cutoff value for said biomarker, thereby identifying said subject as having colorectal cancer or precancerous advanced colorectal polyps.

According to some embodiments, the method further comprises providing the cutoff value for said biomarker. According to some embodiments, the method further comprises providing the cutoff value for each nucleic acid sequence corresponding to the biomarker. According to some embodiments, the method further comprises providing the cutoff value for the plurality of nucleic acid sequences corresponding to the biomarkers.

According to some embodiments, the method further comprises treating the subject having colorectal cancer or precancerous advanced colorectal polyps.

According to some embodiments, treating comprises at least one of administering a chemotherapeutic agent, performing bowel resection, applying radiation therapy and a combination thereof. Each possibility is a separate embodiment of the present invention.

According to some embodiments, the chemotherapeutic agent is selected from the group consisting of: 5-fluorouracil, leucovorin, oxaliplatin, capecitabine and a combination thereof. Each possibility is a separate embodiment of the present invention.

According to some embodiments, there is provided a kit for identifying a subject having colorectal cancer, the kit comprising: (a) means for measuring the expression level of a biomarker comprising at least one nucleic acid sequence selected from the group consisting of SEQ ID NO: 1 to 17 in a biological sample obtained from a subject; and (b) means for determining a cutoff value for said at least one biomarker or information regarding the cutoff value of said at least one biomarker, wherein an expression level of the at least one biomarker above said cutoff value identifies said subject as having colorectal cancer.

According to some embodiments, the means for measuring the expression levels of said biomarker are at least one oligonucleotide capable of amplifying at least one nucleic acid sequence selected from the group consisting of SEQ ID NO: 1 to 17, at least one oligonucleotide capable of hybridizing to said at least one nucleic acid sequence, a nucleotide primer pair flanking the at least one nucleic acid sequence and a combination thereof.

According to some embodiments, the at least one oligonucleotide capable of hybridizing to said at least one nucleic acid sequence comprises a detectable label.

According to some embodiments, the detectable label produces a signal that correlates with the expression level of said at least one biomarker.

According to some embodiments, the detectable label produces an optical signal.

According to some embodiments, said means is a nucleotide primer pair flanking the at least one nucleic acid sequence and the nucleotide primer pair comprises a detectable label.

According to some embodiments, the kit further comprises instructions for use thereof for identifying a subject having colorectal cancer.

Further embodiments, features, advantages and the full scope of applicability of the present invention will become apparent from the detailed description and drawings given hereinafter. However, it should be understood that the detailed description, while indicating preferred embodiments of the invention, is given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows an embodiment of the experimental procedures that are described in the examples below.

FIGS. 2A-2B depict concentration calibration curves of the primers for each of the housekeeping genes HPRT1 (FIG. 2A) and TFRC (FIG. 2B).

FIGS. 3A-3F depict pie charts of true positive percentages (sensitivity) of subjects having colorectal cancer (Cancer), subjects having precancerous advanced polyps (Advanced Polyp) and the false positive percentage (one (1) minus specificity) of healthy (Normal) subpopulation for 6 different biomarkers: BAD (FIG. 3A; SEQ ID NO: 2), BAMBI (FIG. 3B; SEQ ID NO: 3), NEK6 (FIG. 3C; SEQ ID NO: 5), FKBP5 (FIG. 3D; SEQ ID NO: 7), EPAS1 (FIG. 3E; SEQ ID NO: 6) and CHD2 (FIG. 3F; SEQ ID NO: 1).

FIGS. 4A-4B exhibit the normalized expression levels (each column refers to a single subject) of two biomarker combinations: (FIG. 4A) COX11, KIAA1199 and BAD; and (FIG. 4B) CHD2 and EPAS1, in healthy (Normal-textured grey) subjects, subjects having precancerous advanced polyps (Precancerous-solid grey) and subjects having colorectal cancer (Cancer-solid black).

FIG. 5 is a ROC analysis for the maximum values of the biomarkers BAD; BAMBI; CHD2; FKBP5; SASH3; NEK6; EPAS1 and KLF9 (SEQ ID NOs: 2, 3, 1, 7, 17, 5, 6, and 12, respectively), and AUC of cluster-model in healthy (Control) and cancer (CA) yielding sensitivity of 75% and specificity of 93%.

FIG. 6 shows sample distribution, corresponding to the markers of FIG. 5, of cluster-model healthy (Control) and cancer (CA), with the dashed line denoting specificity above 85% and Max Youden index point (0.84).

FIG. 7 is a ROC analysis for the maximum values of the biomarkers BAD and NEK6, and AUC of cluster-model in healthy (Control) and precancerous (AD) yielding sensitivity of 60% and specificity of 87%.

FIG. 8 shows sample distribution, corresponding to the markers of FIG. 7, of cluster-model healthy (Control) and precancerous (AD), with the dashed line denoting specificity above 85% and Max Youden index point (2).
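The figure descriptions above refer to a specificity threshold of 85% and to a maximum Youden index point. The Youden index is J = sensitivity + specificity - 1. The following is a minimal sketch of how a cutoff could be chosen under these two criteria. It is an illustration and not the procedure used to generate the figures. The function names and all scores are hypothetical, and taking the per-subject score to be the maximum normalized level across the panel is only one possible reading of "the maximum values of the biomarkers".

```python
def panel_score(levels):
    """Per-subject score: the highest normalized level across the panel (assumed reading)."""
    return max(levels.values())


def youden_scan(control_scores, case_scores, min_specificity=0.85):
    """Pick the cutoff with the highest Youden index among eligible cutoffs.

    A subject is called positive when its score is strictly above the cutoff.
    Only cutoffs with specificity >= min_specificity are eligible.
    Returns (cutoff, sensitivity, specificity, J), or None if none is eligible.
    """
    if not control_scores or not case_scores:
        raise ValueError("both groups must contain at least one score")
    best = None
    for cutoff in sorted(set(control_scores) | set(case_scores)):
        sens = sum(s > cutoff for s in case_scores) / len(case_scores)
        spec = sum(s <= cutoff for s in control_scores) / len(control_scores)
        j = sens + spec - 1.0
        if spec >= min_specificity and (best is None or j > best[3]):
            best = (cutoff, sens, spec, j)
    return best


# Hypothetical scores (not data from the source).
controls = [0.2, 0.4, 0.5, 0.7, 0.9, 1.0, 1.1, 1.3, 1.4, 2.1]
cases = [0.8, 1.6, 1.9, 2.4, 3.0]
best = youden_scan(controls, cases)
print(best)
```

With these toy scores the selected cutoff is 1.4, which keeps nine of ten controls negative and flags four of five cases.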

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides biomarkers and combinations thereof, applied for identifying precancerous advanced polyps and colorectal cancer.

The present invention thus concerns biomarkers and biomarker combinations and methods for analyzing plasma biomarkers implicated in precancerous advanced polyps and colorectal cancer. The biomarker of the invention includes one or more mRNA segments corresponding to 17 genes, set forth in SEQ ID NOs: 75-91, or fragments thereof, including the gene fragments set forth in SEQ ID NOs: 1-17.

The disclosed methods, kits, biomarkers and biomarker combinations of the present invention are designed to screen and identify colorectal cancer, preferably with sensitivity equal to or greater than 60% and specificity equal to or greater than 85%.
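For reference, sensitivity is the fraction of affected subjects that the test flags, and specificity is the fraction of healthy subjects that the test clears. The following is a minimal sketch of how these two figures could be computed and compared with the 60% and 85% targets stated above. The function names and the toy calls are hypothetical and are not data from the present disclosure.

```python
def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for paired boolean calls.

    predicted[i] is True when the test flags subject i as positive.
    actual[i] is True when the reference diagnosis of subject i is positive.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)
    fn = sum((not p) and a for p, a in pairs)
    tn = sum((not p) and (not a) for p, a in pairs)
    fp = sum(p and (not a) for p, a in pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def meets_targets(sensitivity, specificity, min_sens=0.60, min_spec=0.85):
    """Check the calls against the stated sensitivity and specificity targets."""
    return sensitivity >= min_sens and specificity >= min_spec


# Hypothetical calls for ten subjects: five affected, five healthy.
predicted = [True, True, False, True, False, False, False, False, False, False]
actual = [True, True, True, True, True, False, False, False, False, False]
sens, spec = sensitivity_specificity(predicted, actual)
print(sens, spec, meets_targets(sens, spec))
```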

In general, the methods of the present invention are useful for obtaining biomarker profiles and quantitative information about the expression of many different genes related to diagnosis, including early diagnosis, of precancerous advanced polyps and colorectal cancer in a blood sample.

The level of biomarkers may be measured electrophoretically or immunochemically, wherein the immunochemical detection may be achieved by radioimmunoassay, immunofluorescence assay or by an enzyme-linked immunosorbent assay. In some embodiments, the level of biomarkers is measured by qPCR.

Current molecular diagnostics for CRC have not been sensitive enough to distinguish precancerous advanced polyps from colorectal cancer. About 60% of patients are first diagnosed with late stage disease. Consequently, about $14B are spent annually on treatments and management of CRC patients in the US.

Thus, the diagnostic platform provided herein, offering high specificity and high sensitivity, yet low cost and improved patient compliance, overcomes the deficiencies of the current CRC diagnostics.

According to some embodiments, there is provided a method for identifying a subject having colorectal cancer or precancerous advanced colorectal polyps, the method comprising:

(a) providing a biological sample from a subject;

(b) measuring the expression levels of a biomarker comprising a nucleic acid sequence set forth in SEQ ID NO: 1 in said biological sample; and

(c) identifying an expression level of said biomarker above a cutoff value for said biomarker, thereby identifying said subject as having colorectal cancer or precancerous advanced colorectal polyps.

According to some embodiments, there is provided a method for identifying a subject having colorectal cancer or precancerous advanced colorectal polyps, the method comprising:

(a) providing a biological sample from the subject;

(b) measuring the expression levels of a biomarker comprising a nucleic acid sequence set forth in SEQ ID NO: 2 in said biological sample; and

(c) identifying an expression level of said biomarker above a cutoff value for said biomarker, thereby identifying said subject as having colorectal cancer or precancerous advanced colorectal polyps.

According to some embodiments, the biomarker comprises the nucleic acid sequence set forth in SEQ ID NO: 3. According to some embodiments, the biomarker comprises the nucleic acid sequence set forth in SEQ ID NO: 4. According to some embodiments, the biomarker comprises the nucleic acid sequence set forth in SEQ ID NO: 5. According to some embodiments, the biomarker comprises the nucleic acid sequence set forth in SEQ ID NO: 6. According to some embodiments, the biomarker comprises the nucleic acid sequence set forth in SEQ ID NO: 7. According to some embodiments, the biomarker comprises the nucleic acid sequence set forth in SEQ ID NO: 8. According to some embodiments, the biomarker comprises the nucleic acid sequence set forth in SEQ ID NO: 9. According to some embodiments the biomarker comprises the nucleic acid sequence set forth in SEQ ID NO: 10. According to some embodiments, the biomarker comprises the nucleic acid sequence set forth in SEQ ID NO: 11. According to some embodiments, the biomarker comprises the nucleic acid sequence set forth in SEQ ID NO: 12. According to some embodiments, the biomarker comprises the nucleic acid sequence set forth in SEQ ID NO: 13. According to some embodiments, the biomarker comprises the nucleic acid sequence set forth in SEQ ID NO: 14. According to some embodiments, the biomarker comprises the nucleic acid sequence set forth in SEQ ID NO: 15. According to some embodiments, the biomarker comprises the nucleic acid sequence set forth in SEQ ID NO: 16. According to some embodiments, the biomarker comprises the nucleic acid sequence set forth in SEQ ID NO: 17.

According to some embodiments, the biomarker comprises a plurality of nucleic acid sequences selected from SEQ ID NO: 1-17. According to some embodiments, the method comprises measuring the expression levels of the biomarker and determining a cutoff value for each nucleic acid sequence selected from SEQ ID NO: 1-17, wherein an expression level of at least one nucleic acid sequence of said plurality above the cutoff value indicates that said subject has colorectal cancer or precancerous advanced colorectal polyps.
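The decision rule of this embodiment, in which each sequence has its own cutoff and a single sequence above its cutoff is sufficient, can be sketched as follows. This is an illustration only. The function names, the cutoff values and the expression levels are hypothetical and do not appear in the present disclosure.

```python
def sequences_above_cutoff(levels, cutoffs):
    """Return the sorted identifiers whose level exceeds their own cutoff.

    levels: measured expression level per sequence identifier.
    cutoffs: cutoff value per sequence identifier.
    """
    missing = set(levels) - set(cutoffs)
    if missing:
        raise KeyError(f"no cutoff provided for: {sorted(missing)}")
    return sorted(k for k, v in levels.items() if v > cutoffs[k])


def is_positive(levels, cutoffs):
    """True when at least one sequence of the plurality is above its cutoff."""
    return bool(sequences_above_cutoff(levels, cutoffs))


# Hypothetical cutoffs and normalized levels for a two-sequence panel.
cutoffs = {"SEQ ID NO: 1": 1.5, "SEQ ID NO: 5": 2.0}
subject_a = {"SEQ ID NO: 1": 1.2, "SEQ ID NO: 5": 2.6}
subject_b = {"SEQ ID NO: 1": 0.9, "SEQ ID NO: 5": 1.1}

print(is_positive(subject_a, cutoffs), sequences_above_cutoff(subject_a, cutoffs))
print(is_positive(subject_b, cutoffs))
```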

According to some embodiments, said biomarker comprises the nucleic acid sequence set forth in SEQ ID NO: 1 and further comprises at least one of SEQ ID NOs: 2-3, 5-7, 12 and 17 and said subject is identified as having colorectal cancer.

According to some embodiments, said biomarker comprises the nucleic acid sequences set forth in SEQ ID NOs: 1-3, 5-7, 12 and 17 and said subject is identified as having colorectal cancer.

According to some embodiments, said biomarker consists of the nucleic acid sequences set forth in SEQ ID NOs: 1-3, 5-7, 12 and 17.

According to some embodiments, said biomarker comprises SEQ ID NO: 1 and SEQ ID NO: 5.

According to some embodiments, said biomarker consists of SEQ ID NO: 1 and SEQ ID NO: 5.

According to some embodiments, said biomarker comprises SEQ ID NO: 1 and SEQ ID NO: 3. According to some embodiments, said biomarker comprises SEQ ID NO: 1 and SEQ ID NO: 4. According to some embodiments, said biomarker comprises SEQ ID NO: 1 and SEQ ID NO: 6. According to some embodiments, said biomarker comprises SEQ ID NO: 1 and SEQ ID NO: 14. According to some embodiments, said biomarker comprises SEQ ID NO: 1, SEQ ID NO: 3 and SEQ ID NO: 4. According to some embodiments, said biomarker comprises SEQ ID NO: 1, SEQ ID NO: 3 and SEQ ID NO: 6. According to some embodiments, said biomarker comprises SEQ ID NO: 1, SEQ ID NO: 3 and SEQ ID NO: 14. According to some embodiments, said biomarker comprises SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 6 and SEQ ID NO: 14. According to some embodiments, said biomarker comprises SEQ ID NO: 1, SEQ ID NO: 4 and SEQ ID NO: 6. According to some embodiments, said biomarker comprises SEQ ID NO: 1, SEQ ID NO: 4, SEQ ID NO: 6 and SEQ ID NO: 14. According to some embodiments, said biomarker comprises SEQ ID NO: 1, SEQ ID NO: 6 and SEQ ID NO: 14. According to some embodiments, said biomarker comprises SEQ ID NO: 6 and SEQ ID NO: 9. According to some embodiments, said biomarker comprises SEQ ID NO: 6 and SEQ ID NO: 14. According to some embodiments, said biomarker comprises SEQ ID NO: 9 and SEQ ID NO: 14. According to some embodiments, said biomarker comprises SEQ ID NO: 6, SEQ ID NO: 9 and SEQ ID NO: 14. According to some embodiments, said biomarker consists of any of the aforementioned combinations.

According to some embodiments, the present invention provides a method for identifying a subject having pre-cancerous advanced colorectal polyps comprising: obtaining a biological sample from the subject; measuring the expression levels of a biomarker comprising at least one nucleic acid sequence selected from the group set forth in SEQ ID NO: 1 to 17 (Table 1B) in said biological sample; and determining an expression level of said at least one nucleic acid sequence above its cutoff value, thereby identifying the subject as having pre-cancerous advanced colorectal polyps or colon cancer.

According to some embodiments, determining an expression level of SEQ ID NO: 1 below the cutoff value of SEQ ID NO: 1, an expression level of at least one first biomarker below the cutoff value of said at least one first biomarker and an expression level of at least one second biomarker above the cutoff value of said at least one second biomarker identifies the subject as having pre-cancerous advanced colorectal polyps, wherein said first biomarker is any one or more of SEQ ID NOs: 3-8, 10-13 and 15-17 and wherein said second biomarker comprises at least one of SEQ ID NOs: 2, 9 and 14. Each possibility is a separate embodiment of the present invention.

According to some embodiments, said second biomarker comprises SEQ ID NO: 2. According to some embodiments, said second biomarker comprises SEQ ID NO: 9. According to some embodiments, said second biomarker comprises SEQ ID NO: 14. According to some embodiments, said second biomarker comprises SEQ ID NOs: 2 and 9. According to some embodiments, said second biomarker comprises SEQ ID NOs: 2 and 14. According to some embodiments, said second biomarker comprises SEQ ID NOs: 9 and 14.
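The three-part rule described above (SEQ ID NO: 1 below its cutoff, the selected first biomarkers below their cutoffs, and the selected second biomarkers above their cutoffs) can be sketched as follows. The sketch assumes that every selected first biomarker must be below its cutoff and every selected second biomarker must be above its cutoff, which is one possible reading of the text. The function name, the cutoffs and the levels are hypothetical.

```python
# Candidate sets as listed in the text, keyed by SEQ ID number.
FIRST_CANDIDATES = frozenset({3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 15, 16, 17})
SECOND_CANDIDATES = frozenset({2, 9, 14})


def indicates_advanced_polyps(levels, cutoffs, first_ids, second_ids):
    """Apply the three-part rule; levels and cutoffs are keyed by SEQ ID number."""
    first_ids, second_ids = set(first_ids), set(second_ids)
    if not first_ids or not first_ids <= FIRST_CANDIDATES:
        raise ValueError("first_ids must be a non-empty subset of FIRST_CANDIDATES")
    if not second_ids or not second_ids <= SECOND_CANDIDATES:
        raise ValueError("second_ids must be a non-empty subset of SECOND_CANDIDATES")
    seq1_below = levels[1] < cutoffs[1]
    first_below = all(levels[i] < cutoffs[i] for i in first_ids)
    second_above = all(levels[i] > cutoffs[i] for i in second_ids)
    return seq1_below and first_below and second_above


# Hypothetical cutoffs and normalized levels (not data from the source).
cutoffs = {1: 1.5, 3: 1.0, 2: 2.0}
polyp_like = {1: 1.0, 3: 0.5, 2: 2.5}
other = {1: 1.8, 3: 0.5, 2: 2.5}

print(indicates_advanced_polyps(polyp_like, cutoffs, first_ids=[3], second_ids=[2]))
print(indicates_advanced_polyps(other, cutoffs, first_ids=[3], second_ids=[2]))
```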

According to some embodiments, the terms “precancerous advanced polyps”, “precancerous”, “advanced adenoma”, “AD”, “AA”, and “polyps”, as used herein, are interchangeable and refer to a colorectal polyp, neoplastic pre-cancerous lesions or other abnormal tissue growth or lesion that is likely to develop into a malignant tumor or adenomatous polyps. It has been shown that detection of precancerous advanced polyps lowers the incidence and mortality from CRC. In fact, around 85% of CRCs are sporadic and developed from adenomas.

According to some embodiments, adenomas that are larger than 1 cm, or those with severe dysplasia or a villous architecture are referred to as “advanced adenomas” and are generally considered to be the most relevant subset to detect in screening. The development of CRC from adenoma is estimated to require 5 to 10 years. Since most CRC cases develop from precancerous lesions, screening has substantial clinical benefits to patients.

According to some embodiments, a “biomarker” includes, but is not limited to, one or more of: a molecular indicator of a specific biological property; a biochemical feature or fact that can be used to detect colorectal cancer. Commonly, “biomarker” encompasses, without limitation, proteins, nucleic acids, and metabolites, together with their polymorphisms, mutations, variants, modifications, subunits, fragments, protein-ligand complexes, degradation products, elements, related metabolites, electrolytes, and other analytes or sample-derived measures. Biomarkers may also include mutated proteins or mutated nucleic acids. Biomarkers may also refer to non-analyte physiological markers of health status encompassing other clinical characteristics or risk factors of colorectal cancer such as, without limitation, age, ethnicity, and family history of cancer.

As used herein, the term “biomarker” refers to a nucleic acid sequence of a gene or a fragment thereof, the expression of which is indicative of colon cancer or precancerous advanced colorectal polyps. The biomarker may be an mRNA or the cDNA corresponding thereto, which represents the gene or a fragment thereof. The biomarker comprises any one or more of SEQ ID NOs: 1-17. According to some embodiments, the biomarker comprises any one or more of SEQ ID NOs: 75-91 or fragments thereof, including but not limited to, any one or more of SEQ ID NOs: 1-17.

According to some embodiments, the terms “nucleic acid sequence”, and “polynucleotide”, as used herein, are used interchangeably, and include polymeric forms of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. The following are non-limiting examples of polynucleotides: a gene or gene fragment, exons, introns, messenger RNA (mRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. A polynucleotide may comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs. The sequence of nucleotides may be interrupted by non-nucleotide components. A polynucleotide may be further modified after polymerization, such as by conjugation with a labeling component. The term also includes both double- and single-stranded molecules.

RNA is highly labile, easily degradable, and therefore not likely to be stable or detectable outside of the protective cellular environment. However, RNA expression, which is highly regulated in a normal state, becomes increasingly dysregulated in a pathological state such as cancer. Therefore, profiling RNA expression is useful for identifying cancer type and stage.

Moreover, use of circulating RNA from the plasma for the analysis of cancer is highly attractive for a number of reasons:

(a) sampling requires a minimally invasive method (extraction of a small amount of blood);

(b) sampling can be obtained repeatedly and at any time during tumor progression, allowing for analyzing response to treatment;

(c) the overall simplicity makes it appropriate for use in the asymptomatic population at risk; and

(d) a correlation was noted between circulating tumor cells and circulating tumor mRNA in colon cancer, and it was found that mRNA is more sensitive than DNA in the plasma of breast cancer patients.

According to some embodiments, the nucleic acid sequence representing the biomarker is circulating mRNA.

According to some embodiments, the term “circulating” refers to segments of nucleic acids found in the bloodstream.

According to some embodiments, the nucleic acid sequence representing the biomarker is a cDNA corresponding to circulating mRNA.

As used herein, the term “cDNA” refers to complementary DNA. According to some embodiments, cDNA refers to an isolated polynucleotide, nucleic acid molecule, or any fragment or complement thereof. According to some embodiments, the cDNA is obtained by recombinant techniques or produced synthetically, may be double-stranded or single-stranded, and represents coding and/or non-coding 5′ and 3′ sequences.

According to some embodiments, an “analyte” as used herein refers to any substance to be measured and, optionally, utilized for identifying subpopulations having a certain disease or disorder. Stated otherwise, a biomarker (analyte) may be a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes or pharmacological responses to a therapeutic intervention.

According to some embodiments, the term “colon cancer” refers to cancers and/or neoplasms that form in the tissues of the colon (the longest part of the large intestine). Typically, colon cancers are adenocarcinomas (cancers that are initiated in cells that produce and release mucus and other fluids).

According to some embodiments, the term “rectal cancer” refers to cancers and/or neoplasms that form in the tissues of the rectum (the last several inches of the large intestine preceding the anus).

According to some embodiments, the term “colorectal cancer” in the context of the present invention includes, but is not limited to, cancer that arises in either the colon or the rectum.

The present invention is based, in part, on the unexpected discovery that a distinct biomarker and a distinct set of biomarkers in a fluid (blood) sample or any excretions from a subject identify a cancerous state or a precancerous state of the subject with high specificity and sensitivity. Thus, identification according to the invention is accurate and reliable. Moreover, since the biomarkers of the invention are obtained from fluid samples (e.g., serum, plasma, or blood) or from excretions (e.g., stool or urine), the methods of the invention are advantageously non-invasive.

As used herein, the terms “identification”, “identifying a subject as” and “identifies the subject as having” are interchangeable and encompass any one or more of screening for colorectal cancer; detecting the presence of, or severity of, cancer; prognosis of cancer; early diagnosis of cancer; diagnosing precancerous advanced polyps; treatment efficacy and/or relapse of cancer; as well as a platform for selecting therapy and/or a treatment for cancer, optimization of a given therapy for cancer, and/or predicting the suitability of a therapy for specific subjects (e.g., patients) or subpopulations or determining the appropriate dosing of a therapeutic product in patients or subpopulations. Each possibility is a separate embodiment of the present invention.

According to some embodiments, the subject is a human subject.

According to some embodiments, the sample obtained from the subject is a body fluid or excretion sample including, but not limited to, seminal plasma, blood, peripheral blood, serum, urine, prostatic fluid, seminal fluid, semen, the external secretions of the skin, respiratory, intestinal, and genitourinary tracts, tears, cerebrospinal fluid, sputum, saliva, milk, peritoneal fluid, pleural fluid, cyst fluid, lavage of body cavities, bronchoalveolar lavage, lavage of the reproductive system and/or lavage of any other organ of the body or system in the body, and stool. Each possibility is a separate embodiment of the present invention.

According to some embodiments, obtaining a biological sample comprising tissue or fluid is carried out by any one or more of the following collection methods: blood sampling, urine sampling, stool sampling, sputum sampling, aspiration of pleural or peritoneal fluids, fine needle biopsy, needle biopsy, core needle biopsy, surgical biopsy, and lavage. Each possibility is a separate embodiment of the present invention. Regardless of the procedure employed, once a biopsy/sample is obtained the level of the biomarkers can be determined and a diagnosis can thus be made.

According to some embodiments, the sample obtained from the subject is peripheral blood.

According to some embodiments, the term “peripheral blood”, as used herein, refers to blood comprising red blood cells, white blood cells and platelets. Typically, the sample is a pool of circulating blood. According to some embodiments, the sample is a peripheral blood sample not sequestered within the lymphatic system, spleen, liver, or bone marrow.

According to some embodiments, the sample is a plasma sample. According to some embodiments, the sample is a plasma sample derived from peripheral blood.

According to some embodiments, the plurality of biomarkers described herein, optionally includes any sub-combination of biomarkers, and/or a combination featuring at least one other biomarker, for example a known biomarker.

According to some embodiments, as described herein, the plurality of biomarkers is correlated with colorectal cancer.

According to some embodiments, the term “a plurality”, as used herein, refers to at least two. According to some embodiments, the term “a plurality” refers to at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 or 17.

According to some embodiments, “measuring the expression levels” comprises assessing the presence, absence, quantity or relative amount (which can be an “effective amount”) of a given substance, typically an mRNA or a cDNA, within a clinical or subject-derived sample, including qualitative or quantitative concentration levels of such substances, or otherwise evaluating the values or categorization of a subject's clinical parameters.

According to some embodiments, “measuring the expression levels” comprises determining the mRNA expression levels of said plurality of biomarkers or determining the amount, or relative amount, of cDNA corresponding to the expression level of the mRNA biomarker(s).

According to some embodiments, the cutoff value of the biomarker refers to an expression level which differentiates the population of healthy subjects from the population of non-healthy subjects. According to some embodiments, the level of each biomarker set forth in SEQ ID NO: 1 to 17 is below the cutoff value of each of said biomarkers in a population of healthy subjects.

According to some embodiments, the cutoff value is a statistically significant value. According to some embodiments, the p value of the cutoff value is at most 0.05. According to some embodiments, an expression level of at least one biomarker above or below said cutoff value of said at least one biomarker determines the CRC state of the subject.

According to some embodiments, determining the cutoff value for each biomarker includes measuring the expression level of said at least one biomarker in a large population of subjects that are either healthy, have precancerous advanced polyps or have colorectal cancer.
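By way of non-limiting illustration, one possible way of deriving a cutoff value from measured populations may be sketched as follows. The selection criterion (maximizing sensitivity plus specificity, i.e., Youden's J statistic), the function name and the example expression values are assumptions made for illustration only and are not taken from the experimental data described herein.

```python
from typing import Sequence, Tuple


def youden_cutoff(healthy: Sequence[float],
                  non_healthy: Sequence[float]) -> Tuple[float, float]:
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A subject is called positive when the expression level is strictly
    above the cutoff. The criterion itself is an assumed choice.
    """
    if not healthy or not non_healthy:
        raise ValueError("both populations must be non-empty")

    # Every observed expression level is tried as a candidate cutoff.
    candidates = sorted(set(healthy) | set(non_healthy))
    best_cutoff, best_j = candidates[0], -1.0
    for cutoff in candidates:
        sensitivity = sum(x > cutoff for x in non_healthy) / len(non_healthy)
        specificity = sum(x <= cutoff for x in healthy) / len(healthy)
        j = sensitivity + specificity - 1.0
        if j > best_j:  # keep the lowest cutoff among ties
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical relative expression levels, for illustration only.
example_healthy = [1, 2, 3]
example_non_healthy = [4, 5, 6]
example_cutoff, example_j = youden_cutoff(example_healthy, example_non_healthy)
```

In this sketch, subjects with precancerous advanced polyps and subjects with colorectal cancer would both be placed in the non-healthy population; a separate cutoff per condition could be derived by calling the function once per condition.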

According to some embodiments, the methods of the invention further comprise reverse transcribing each of the mRNA biomarkers and obtaining the corresponding complementary DNA (cDNA). According to some embodiments, measuring of the quantity of each cDNA is performed by quantitative polymerase chain reaction (qPCR).

According to some embodiments, the expression levels are measured by quantitative real-time PCR (qRT-PCR).

According to some embodiments, the pair of oligonucleotides is preferably selected to have compatible melting temperatures (Tm), e.g., melting temperatures which differ by less than 7° C., preferably less than 5° C., more preferably less than 4° C., most preferably less than 3° C., ideally between 3° C. and 0° C.
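By way of non-limiting illustration, the melting temperature compatibility of a primer pair may be checked as sketched below. The Tm estimate used here (the Wallace rule, 2° C. per A/T and 4° C. per G/C) is an assumed, rough approximation for short oligonucleotides and is not specified herein; the primer sequences are hypothetical and do not correspond to any sequence of Table 2.

```python
def wallace_tm(primer: str) -> float:
    """Rough Tm estimate in degrees C using the Wallace rule (assumption)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer must contain only A, C, G and T")
    return 2.0 * at + 4.0 * gc


def tm_difference(forward: str, reverse: str) -> float:
    """Absolute difference between the estimated Tm values of two primers."""
    return abs(wallace_tm(forward) - wallace_tm(reverse))


def compatibility_tier(diff_c: float) -> str:
    """Map a Tm difference onto the preference tiers recited in the text."""
    tiers = [
        (3.0, "most preferred"),   # less than 3 C
        (4.0, "more preferred"),   # less than 4 C
        (5.0, "preferred"),        # less than 5 C
        (7.0, "compatible"),       # less than 7 C
    ]
    for limit, label in tiers:
        if diff_c < limit:
            return label
    return "not compatible"


# Hypothetical 20-mer primers, for illustration only.
forward_primer = "ATGCATGCATGCATGCATGC"
reverse_primer = "ATGCATGCATGCATGCATGA"
tier = compatibility_tier(tm_difference(forward_primer, reverse_primer))
```

More accurate Tm estimates (e.g., nearest-neighbor thermodynamic models) could be substituted for `wallace_tm` without changing the tier logic.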

As used herein, quantitative polymerase chain reaction (qPCR) is a method of quantitatively measuring the amplification of DNA using fluorescent probes. This technology utilizes oligonucleotide probes that have a fluorescent reporter attached to the 5′ end and a quencher attached to the 3′ end. During PCR amplification, these probes hybridize to the target sequences located in the amplicon and, as the polymerase replicates the template with the probe bound, it also cleaves the fluorescent reporter due to the polymerase 5′-nuclease activity. Because the close proximity between the quencher molecule and the fluorescent reporter normally prevents fluorescence from being detected, the decoupling results in an increase in fluorescence intensity proportional to the number of probe cleavage cycles.

According to some embodiments, the length of the segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and, therefore, this length is a controllable parameter. Because the desired segments of the target sequence become the dominant sequences (in terms of concentration) in the mixture, they are said to be “PCR-amplified”. Many variables can influence the mean efficiency of PCR, including target DNA length and secondary structure, primer length and design, primer and dNTP concentrations, and buffer composition, to name but a few. Contamination of the reaction with exogenous DNA (e.g., DNA spilled onto lab surfaces) or cross-contamination is also a major consideration. These reaction conditions must be carefully optimized for each different primer pair and target sequence.

According to some embodiments, determining the expression levels of the biomarkers may comprise detection of the expression or expression levels of specific nucleic acid sequences via any means known in the art, and as described herein.

According to some embodiments, determining the quantity and/or concentration of cDNA or mRNA is performed by employing at least one probe or at least one primer, preferably a primer pair. Typically, the nucleic acid probe or primer is suitable for detecting the expression or expression levels of a specific biomarker of the present invention.

As used herein, a “primer” defines an oligonucleotide which is capable of annealing to (hybridizing with) a target sequence, thereby creating a double stranded region which can serve as an initiation point for DNA synthesis under suitable conditions.

According to some embodiments, the terminology “primer pair” refers herein to a pair of oligonucleotides (oligos) according to at least some embodiments of the present invention, which are selected to be used together in amplifying a selected nucleic acid sequence by one of a number of types of amplification processes, preferably a polymerase chain reaction. Other types of amplification processes include ligase chain reaction, strand displacement amplification, or nucleic acid sequence-based amplification, as explained in greater detail below. As commonly known in the art, the oligos are designed to bind to a complementary sequence under selected conditions.

According to some embodiments of the present invention, oligonucleotide primers may be of any suitable length, depending on the particular assay format and the particular needs and targeted genomes employed. Optionally, the oligonucleotide primers are at least 12 nucleotides in length, preferably between 15 and 24 nucleotides, and they may be adapted to be especially suited to a chosen nucleic acid amplification system. As commonly known in the art, the oligonucleotide primers can be designed by taking into consideration the melting point of hybridization thereof with their targeted sequence (Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, 2nd Edition, CSH Laboratories).

According to some embodiments, the expression levels of the biomarkers of the present invention are determined using the primers listed in Table 2.

According to some embodiments, the “sensitivity” of a diagnostic assay is the percentage of diseased individuals who test positive (percent of “true positives”). Diseased individuals not detected by the assay are “false negatives”. Subjects who are not diseased and who test negative in the assay are termed “true negatives.” The “specificity” of the diagnostic assay is one (1) minus the false positive rate, where the “false positive” rate is defined as the proportion of those without the disease who test positive. While a particular diagnostic method may not provide a definitive diagnosis of a condition, it suffices if the method provides a positive indication that aids in diagnosis.
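By way of non-limiting illustration, the definitions of sensitivity and specificity given above may be expressed as follows. The function names and the example counts are hypothetical and are not results of the present study.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of diseased individuals who test positive."""
    diseased = true_positives + false_negatives
    if diseased == 0:
        raise ValueError("no diseased individuals in the sample")
    return true_positives / diseased


def specificity(true_negatives: int, false_positives: int) -> float:
    """One minus the false positive rate, as defined in the text."""
    non_diseased = true_negatives + false_positives
    if non_diseased == 0:
        raise ValueError("no non-diseased individuals in the sample")
    false_positive_rate = false_positives / non_diseased
    return 1.0 - false_positive_rate


# Hypothetical counts, for illustration only.
example_sensitivity = sensitivity(true_positives=35, false_negatives=5)
example_specificity = specificity(true_negatives=81, false_positives=19)
```

Multiplying either value by 100 gives the percentage form used in the embodiments.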

According to some embodiments, the method disclosed herein distinguishes a disease or condition (particularly colorectal cancer) with a sensitivity of at least 19% at a specificity of at least 97% when compared to normal subjects (e.g., a healthy individual not afflicted with cancer). According to some embodiments, the method distinguishes a disease or condition with a sensitivity of at least 44% at a specificity of at least 92% when compared to normal subjects. According to some embodiments, the method distinguishes a disease or condition with a sensitivity of at least 56.5% at a specificity of at least 79% when compared to normal subjects. According to some embodiments, the method distinguishes a disease or condition with a sensitivity of at least 58% at a specificity of at least 92% when compared to subjects exhibiting symptoms that mimic disease or condition symptoms. According to some embodiments, the method distinguishes a disease or condition with a sensitivity of at least 66% at a specificity of at least 78% when compared to normal subjects. According to some embodiments, the method distinguishes a disease or condition with a sensitivity of at least 100% at a specificity of at least 85% when compared to normal subjects. According to some embodiments, the method distinguishes precancerous advanced polyps with a sensitivity of at least 53% and colorectal cancer with a sensitivity of at least 87.5% at a specificity of at least 81% when compared to normal subjects.

According to some embodiments, the term “relative quantity” of a biomarker refers to an amount of a biomarker in a subject's sample that is consistent with diagnosis of a particular disease or condition. A relative quantity can be either in absolute amount (e.g., microgram/ml) or a relative amount (e.g., relative intensity of signals).

According to some embodiments, individual biomarkers and/or combinations of biomarkers may optionally be used for diagnosis of time of onset of a disease or condition. Such diagnosis may optionally be useful for a wide variety of conditions, including those conditions with an abrupt onset.

The skilled artisan will understand that associating an indicator with a predisposition to an adverse outcome is a performance (sensitivity and specificity) analysis. For example, an RNA biomarker expression level greater than a pre-set cutoff value may signal that a patient has CRC, whereas an RNA biomarker expression level less than or equal to the pre-set cutoff value may indicate that a subject is healthy or does not have CRC.
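By way of non-limiting illustration, the comparison against a pre-set cutoff value may be sketched as follows. The rule of flagging a subject when at least one biomarker exceeds its cutoff is one possible reading of the embodiments; the function name, the labels and the numeric values are hypothetical.

```python
from typing import Dict, List, Tuple


def identify_subject(levels: Dict[str, float],
                     cutoffs: Dict[str, float]) -> Tuple[str, List[str]]:
    """Compare each biomarker expression level with its pre-set cutoff.

    A level strictly greater than the cutoff flags the biomarker; a level
    less than or equal to the cutoff does not.
    """
    missing = [name for name in levels if name not in cutoffs]
    if missing:
        raise KeyError(f"no cutoff defined for: {missing}")

    flagged = [name for name, level in levels.items() if level > cutoffs[name]]
    status = "CRC indicated" if flagged else "CRC not indicated"
    return status, flagged


# Hypothetical relative expression levels and cutoffs, for illustration only.
example_levels = {"SEQ ID NO: 1": 2.4, "SEQ ID NO: 5": 0.7}
example_cutoffs = {"SEQ ID NO: 1": 1.5, "SEQ ID NO: 5": 1.0}
example_status, example_flagged = identify_subject(example_levels, example_cutoffs)
```

A subject flagged in this manner may then undergo additional diagnostic procedures, the cutoff comparison serving as an aid to diagnosis rather than a definitive determination.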

Additionally, a change in biomarker concentration from baseline levels may be reflective of the status of a disease or its progression (if temporal monitoring is involved), or of the therapeutic effect of a treatment whereas the degree of change in biomarker expression level may be related to the severity of CRC. Statistical significance is often determined by comparing two or more populations, and determining a confidence interval (CI) and/or a p value.

According to some embodiments, the confidence intervals (CI) of the invention are 90%, 95%, 97.5%, 98%, 99%, 99.5%, 99.9% and 99.99%, while preferred p values are less than 0.1, 0.05, 0.025, 0.02, 0.01, 0.005, 0.001 or less than 0.0001. Exemplary statistical tests for identifying CRC and precancerous advanced polyps are described hereinafter.
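By way of non-limiting illustration, a p value for the difference in expression level between two populations may be obtained as sketched below. The choice of an exact two-sided permutation test on the difference of means is an assumption made for illustration and is not necessarily one of the statistical tests applied in the examples; the expression values are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def permutation_p_value(group_a: Sequence[float],
                        group_b: Sequence[float]) -> float:
    """Exact two-sided permutation test on the difference of group means.

    Enumerates every split of the pooled values into groups of the original
    sizes, so it is only practical for small samples.
    """
    if not group_a or not group_b:
        raise ValueError("both groups must be non-empty")

    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    extreme = 0
    splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance guards against floating-point round-off.
        if diff >= observed - 1e-12:
            extreme += 1
        splits += 1
    return extreme / splits


# Hypothetical relative expression levels, for illustration only.
p_small = permutation_p_value([1, 2, 3], [10, 11, 12])
p_larger = permutation_p_value([1, 2, 3, 4], [10, 11, 12, 13])
```

With three subjects per group the smallest attainable p value is 2/20, which illustrates why a sufficiently large population is needed before a p value below 0.05 can be reached.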

According to some embodiments, the detection of a nucleic acid of interest in a biological sample may be carried out by any method known in the art. Optionally detection of a nucleic acid of interest is effected by hybridization-based assays using an oligonucleotide probe. Traditional hybridization assays include PCR, reverse-transcriptase PCR, Real-time PCR, quantitative PCR, quantitative real-time PCR, RNase protection, in-situ hybridization, primer extension, dot or slot blots (RNA), and Northern blots (i.e., for RNA detection). Other detection methods include kits containing probes on a dipstick setup and the like.

According to some embodiments, probes may be labeled according to numerous well known methods. Non-limiting examples of detectable markers include ligands, fluorophores, chemiluminescent agents, enzymes, and antibodies. Other detectable markers for use with probes, which can enable an increase in sensitivity of the method of the invention, include biotin and radio-nucleotides. It will become evident to the person of ordinary skill that the choice of a particular label dictates the manner in which it is bound to the probe.

According to some embodiments, the probes are selected from the probes listed in Table 2.

According to some embodiments, the probe oligonucleotides may be labeled subsequent to synthesis, by incorporating biotinylated dNTPs or rNTP, or some similar means (e.g., photo-cross-linking a psoralen derivative of biotin to RNAs), followed by addition of labeled streptavidin (e.g., phycoerythrin-conjugated streptavidin) or the equivalent. Alternatively, when fluorescently-labeled oligonucleotide probes are used, fluorescein, FAM, lissamine, phycoerythrin, rhodamine, Cy2, Cy3, Cy3.5, Cy5, Cy5.5, Cy7, FluorX and others can be attached to the oligonucleotides. Preferably, detection of the biomarkers of the invention is achieved by using TaqMan assays, preferably by using combined reporter and quencher molecules (Roche Molecular Systems Inc.).

According to some embodiments, detection of a nucleic acid of interest in a biological sample may also optionally be effected by NAT-based assays, which involve nucleic acid amplification technology, such as PCR for example (or variations thereof such as qPCR for example).

Amplification of a selected, or target, nucleic acid sequence may be carried out by a number of suitable methods. Numerous amplification techniques have been described and can be readily adapted to suit particular needs of a person of ordinary skill. Non-limiting examples of amplification techniques include polymerase chain reaction (PCR), ligase chain reaction (LCR), strand displacement amplification (SDA), transcription-based amplification, the Qβ replicase system and NASBA.

According to some embodiments, a nucleic acid sample from a subject is amplified under conditions which favor the amplification of the most abundant differentially expressed nucleic acid. According to some embodiments, reverse transcription into cDNA is carried out on an mRNA sample from a patient. According to some embodiments, the amplification of the differentially expressed nucleic acids is carried out simultaneously. It will be realized by a person skilled in the art that such methods could be adapted for the detection of differentially expressed proteins instead of differentially expressed nucleic acid sequences.

According to some embodiments, the nucleic acid (e.g., mRNA) for practicing the present invention may be obtained according to well known methods.

According to some embodiments, detection may also optionally be performed with a chip or other such device. The nucleic acid sample which includes the candidate region to be analyzed is optionally isolated, amplified and labeled with a reporter group. This reporter group may be a fluorescent group such as phycoerythrin. The labeled nucleic acid is then incubated with the probes immobilized on the chip using a fluidics station. Once the reaction is completed, the chip is inserted into a scanner and patterns of hybridization are detected. The hybridization data is collected, as a signal emitted from the reporter groups already incorporated into the nucleic acid, which is now bound to the probes attached to the chip. Since the sequence and position of each probe immobilized on the chip is known, the identity of the nucleic acid hybridized to a given probe can be determined.

It will be appreciated that when utilized along with automated equipment, the above described detection methods may be used to screen multiple samples for a disease and/or pathological condition both rapidly and easily.

According to some embodiments, there is provided a kit for identifying colorectal cancer or precancerous advanced colorectal polyps in a biological sample, the kit comprising one or more containers filled with a nucleotide primer pair flanking a biomarker comprising a nucleic acid sequence set forth in SEQ ID NO: 1, wherein said nucleotide primer pair is designed to selectively amplify a fragment of the genome of the individual in said sample that includes the biomarker.

According to some embodiments, the nucleotide primer pair is selected from the nucleotide primer pairs listed in Table 2.

According to some embodiments, said nucleotide primer pair comprises SEQ ID NOs: 40 and 41.

According to some embodiments, said biomarker further comprises at least one nucleic acid sequence selected from SEQ ID NOs: 2, 3, 5-7, 12 and 17 and said nucleotide primer pair comprises at least one of SEQ ID NOs: 30 and 31; SEQ ID NOs: 34 and 35; SEQ ID NOs: 67 and 68; SEQ ID NOs: 49 and 50; SEQ ID NOs: 52 and 53; SEQ ID NOs: 64 and 65; and SEQ ID NOs: 73 and 74, respectively.

According to some embodiments, said biomarker comprises the nucleic acid sequences set forth in SEQ ID NOs: 1-3, 5-7, 12 and 17, said nucleotide primer pair comprises SEQ ID NOs: 40 and 41; SEQ ID NOs: 30 and 31; SEQ ID NOs: 34 and 35; SEQ ID NOs: 67 and 68; SEQ ID NOs: 49 and 50; SEQ ID NOs: 52 and 53; SEQ ID NOs: 64 and 65; and SEQ ID NOs: 73 and 74 and said kit is for identifying colorectal cancer.

According to some embodiments, said biomarker consists of the nucleic acid sequences set forth in SEQ ID NOs: 1-3, 5-7, 12 and 17, said nucleotide primer pair comprises SEQ ID NOs: 40 and 41; SEQ ID NOs: 30 and 31; SEQ ID NOs: 34 and 35; SEQ ID NOs: 67 and 68; SEQ ID NOs: 49 and 50; SEQ ID NOs: 52 and 53; SEQ ID NOs: 64 and 65; and SEQ ID NOs: 73 and 74 and said kit is for identifying colorectal cancer.

According to some embodiments, said biomarker comprises the nucleic acid sequences set forth in SEQ ID NOs: 1 and 5, said nucleotide primer pair comprises SEQ ID NOs: 95 and 96; and SEQ ID NOs: 67 and 68, and said subject is identified as having precancerous advanced colorectal polyps.

According to some embodiments, said biomarker consists of the nucleic acid sequences set forth in SEQ ID NOs: 1 and 5, said nucleotide primer pair comprises SEQ ID NOs: 40 and 41; and SEQ ID NOs: 67 and 68 and said subject is identified as having precancerous advanced colorectal polyps.

According to some embodiments, said biomarker comprises SEQ ID NO: 1 and at least one nucleic acid sequence selected from SEQ ID NOs: 3, 4, 6 and 14, and said nucleotide primer pair comprises SEQ ID NOs: 40 and 41; and at least one of SEQ ID NOs: 34 and 35; SEQ ID NOs: 55 and 56; SEQ ID NOs: 49 and 50; SEQ ID NOs: 61 and 62, respectively.

According to some embodiments, said biomarker comprises SEQ ID NOs: 1 and 4 and at least one nucleic acid sequence selected from SEQ ID NOs: 3, 6 and 14 and said nucleotide primer pair comprises SEQ ID NOs: 40 and 41 and SEQ ID NOs: 55 and 56; and at least one of SEQ ID NOs: 34 and 35; SEQ ID NOs: 49 and 50; SEQ ID NOs: 61 and 62, respectively.

According to some embodiments, said biomarker comprises SEQ ID NOs: 1, 3 and 4 and said nucleotide primer pair comprises SEQ ID NOs: 40 and 41; SEQ ID NOs: 34 and 35 and SEQ ID NOs: 55 and 56.

According to some embodiments, said biomarker comprises SEQ ID NOs: 1, 4, 6 and 14 and said nucleotide primer pair comprises SEQ ID NOs: 40 and 41; SEQ ID NOs: 55 and 56; SEQ ID NOs: 49 and 50; and SEQ ID NOs: 61 and 62.

According to some embodiments, said biomarker comprises SEQ ID NOs: 1, 3, 4 and 14 and said nucleotide primer pair comprises SEQ ID NOs: 40 and 41; SEQ ID NOs: 34 and 35; SEQ ID NOs: 55 and 56; and SEQ ID NOs: 61 and 62.
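By way of non-limiting illustration, the correspondence between biomarkers and primer pairs recited in the kit embodiments above may be held in a simple lookup table, sketched below. The pairings follow the "respectively" ordering of those embodiments; one embodiment recites SEQ ID NOs: 95 and 96 together with SEQ ID NO: 1, and that alternative is not modeled here. The names `PRIMER_PAIRS` and `primer_pairs_for` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Biomarker SEQ ID NO -> (primer SEQ ID NO, primer SEQ ID NO), as recited
# in the kit embodiments. The data structure itself is an illustration.
PRIMER_PAIRS: Dict[int, Tuple[int, int]] = {
    1: (40, 41),
    2: (30, 31),
    3: (34, 35),
    4: (55, 56),
    5: (67, 68),
    6: (49, 50),
    7: (52, 53),
    12: (64, 65),
    14: (61, 62),
    17: (73, 74),
}


def primer_pairs_for(panel: Sequence[int]) -> List[Tuple[int, int]]:
    """Return the primer pair for each biomarker of a panel, in panel order."""
    unknown = [seq_id for seq_id in panel if seq_id not in PRIMER_PAIRS]
    if unknown:
        raise KeyError(f"no primer pair recited for SEQ ID NO(s): {unknown}")
    return [PRIMER_PAIRS[seq_id] for seq_id in panel]


# Panels recited in the embodiments above.
crc_panel = primer_pairs_for([1, 2, 3, 5, 6, 7, 12, 17])
polyp_panel = primer_pairs_for([1, 5])
```

Keeping the pairings in one table makes it straightforward to check that every biomarker of a chosen panel has a recited primer pair before a kit is assembled.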

According to some embodiments, the terms “cancer” and “colorectal cancer” are interchangeable.

According to some embodiments, the cancer is invasive. According to other embodiments, the cancer is non-invasive. According to yet other embodiments, the cancer is non metastatic. According to some embodiments, the cancer is metastatic. According to some embodiments, the cancer is a metastasis of colorectal cancer.

According to some embodiments, the kits and methods of the invention are used for monitoring individuals who are at high risk for colorectal cancer, such as, those who have been diagnosed in the past with localized disease, metastasized disease or those who are genetically linked to the disease, or those who have family members of first and second degree diagnosed in the past with cancer. Individuals with a history of inflammatory conditions of the colon such as ulcerative colitis or Crohn's colitis may also be considered as individuals who are in high risk groups for colorectal cancer. Molecular diagnostics according to the present invention may be used for monitoring individuals who are undergoing, or have been treated for, colorectal cancer, in order to determine if the cancer has been eliminated. Screening and diagnostic kits and methods according to the present invention may be used in the monitoring of individuals who have been identified as genetically predisposed such as by genetic screening and/or family histories. Screening and diagnostic kits and methods according to the present invention may be used in the monitoring of asymptomatic individuals whether or not identified as genetically predisposed.

The invention is useful for identifying individuals who show at least one symptom or characteristic of cancer, e.g. presence of polyps in the colon.

According to some embodiments, the present invention is used for monitoring individuals who have been identified as having family medical histories which include relatives who have suffered from colorectal cancer. Likewise, the invention is particularly useful to monitor individuals who have been treated and had tumors removed or are otherwise experiencing remission.

According to some embodiments, the present invention further provides a method for treating a subject having colorectal cancer, the method comprising identifying a subject having colorectal cancer or precancerous advanced colorectal polyps, and treating said subject, wherein treating comprises at least one of administering a chemotherapeutic agent, performing bowel resection, applying radiation therapy and a combination thereof.

According to some embodiments, the chemotherapeutic agent includes, but is not limited to, 5-fluorouracil, leucovorin, oxaliplatin or capecitabine; and/or a monoclonal antibody, such as bevacizumab, cetuximab or panitumumab, or an alternative monoclonal antibody, or a combination thereof. Each possibility is a separate embodiment of the present invention.

According to some embodiments, treating a subject for precancerous advanced polyps comprises removal of said precancerous advanced polyps.

According to some embodiments, removal of said precancerous advanced polyps comprises performing one or more of colonoscopy, flexible sigmoidoscopy and open surgery. Each possibility is a separate embodiment of the present invention.

According to some embodiments, the identification, diagnosis, early diagnosis and/or prognosis of said subject according to the present invention enables a man skilled in the art (i.e., clinician or physician) to determine and/or manage the subject treatment regimen. Managing subject treatment includes determination of the severity of the cancerous state (e.g., cancer status). For example, if the severity of the cancerous state indicates that surgery is appropriate, the physician may schedule the patient for surgery. Likewise, if the severity of the cancerous state indicates late stage cancer or if the status is acute, no further action may be warranted. Furthermore, if the results show that treatment has been successful, no further management or treatment may be necessary. Alternatively, if the result of the methods of the present invention is inconclusive or there is reason that confirmation of status is necessary, the physician may order more diagnostic tests. Accordingly, patients that are found to have at least one biomarker with an expression level above the cutoff value that identifies them as having colorectal cancer may undergo additional diagnostic procedures.

As used herein, a “subject” commonly refers to a mammalian subject. A mammalian subject may be human or non-human, preferably human.

According to some embodiments, a healthy subject is defined as a subject without detectable colorectal diseases or symptoms, colorectal associated diseases or precancerous advanced polyps, determined by conventional diagnostic methods.

Various embodiments and aspects of the present invention as delineated hereinabove and as claimed in the claims section below find experimental support in the following examples.

EXAMPLES

Example 1

Study Population and Specimen Preparation

Subjects at least 50 years old and scheduled for colonoscopy participated in the study. To ensure that only average risk individuals were enrolled, the following were excluded from the study: previous CRC or adenomas; iron deficiency anemia or haematochezia (blood in the stool) within the previous 6 months; or family history indicating increased risk for the disease (two or more first degree relatives with CRC or one or more with CRC at age 50 years or less; or known Lynch syndrome or familial adenomatous polyposis).

Colonoscopy procedures, including polypectomy and biopsy, were performed by board certified endoscopists using screening standards and site specific standards for sedation, monitoring, imaging and equipment. Histopathology, diagnostic procedures, and staging of biopsy and surgical specimens used routine procedures. Samples from 137 subjects were available for selection into laboratory analysis, including 55 normal subjects, 47 with advanced adenomas and 35 with CRC. The clinical as well as histological parameters of the study groups are depicted in Table 1A.

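TABLE 1A

Characteristic                     | Normal, N = 55 (%) | Advanced Adenoma, N = 47 (%) | CRC, N = 35 (%)
-----------------------------------|--------------------|------------------------------|----------------
Age                                |                    |                              |
  <50                              | 13 (24)            | 2 (4)                        | 5 (14)
  50-<60                           | 17 (31)            | 12 (26)                      | 5 (14)
  60-<70                           | 15 (27)            | 15 (32)                      | 10 (29)
  70-<80                           | 9 (16)             | 16 (34)                      | 14 (40)
  80+                              | 1 (2)              | 2 (4)                        | 1 (3)
Gender                             |                    |                              |
  Male                             | 30 (55)            | 29 (62)                      | 19 (54)
  Female                           | 25 (45)            | 18 (38)                      | 16 (46)
Location                           |                    |                              |
  Rectum                           |                    | 4 (9)                        | 11 (32)
  Left                             |                    | 16 (34)                      | 12 (34)
  Right                            |                    | 25 (53)                      | 12 (34)
  Unknown (UK)                     |                    | 2 (4)                        |
Size                               |                    |                              |
  <1 cm                            |                    | 12 (26)                      |
  >=1 cm                           |                    | 35 (74)                      |
  >=3 cm                           |                    |                              |
Villous component                  |                    |                              |
  +                                |                    | 28 (60)                      |
  −                                |                    | 19 (40)                      |
Tumor Differentiation (TD) Level   |                    |                              |
  Well                             |                    |                              | 5 (14)
  Moderate                         |                    |                              | 20 (57)
  Poor                             |                    |                              | 5 (14)
  UK                               |                    |                              | 5 (14)
Stage                              |                    |                              |
  I                                |                    |                              | 5 (14)
  II                               |                    |                              | 18 (51)
  III                              |                    |                              | 11 (31)
  IV                               |                    |                              | 0 (0)
  UK                               |                    |                              | 1 (3)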
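By way of non-limiting illustration, the percentages in Table 1A can be recomputed from the counts and the group sizes, as sketched below for the male subjects in each study group. The counts are those of Table 1A; the variable and function names are hypothetical, and rounding to the nearest whole percent is an assumption about how the table was prepared.

```python
# Group sizes and male counts as listed in Table 1A.
GROUP_SIZES = {"Normal": 55, "Advanced Adenoma": 47, "CRC": 35}
MALE_COUNTS = {"Normal": 30, "Advanced Adenoma": 29, "CRC": 19}


def percent_of_group(count: int, group_size: int) -> int:
    """Percentage of a study group, rounded to the nearest whole percent."""
    if group_size <= 0:
        raise ValueError("group size must be positive")
    # Python's round() uses banker's rounding at exact .5 values; none of
    # the values computed here falls exactly on .5.
    return round(100 * count / group_size)


male_percentages = {
    group: percent_of_group(MALE_COUNTS[group], GROUP_SIZES[group])
    for group in GROUP_SIZES
}
```

The recomputed values for the male subjects agree with the percentages listed in the Gender rows of Table 1A.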

Following the consent of patients recruited for the study, about 10 ml of blood were provided before surgery or colonoscopy using a collection tube (vacutainer). The collected blood was kept refrigerated until further processing, up to 4 hours following the collection.

Plasma was separated from blood cells by centrifugation. The plasma was homogenized with TRIzol® Reagent (Invitrogen). Each volume of plasma was mixed with 3.5 volumes of TRIzol reagent. The mixture was divided into storage micro tubes and stored at −80° C. until further purification.
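By way of non-limiting illustration, the reagent volume implied by the ratio of 3.5 volumes of TRIzol per volume of plasma may be computed as sketched below. The function names and the example plasma volume are hypothetical.

```python
TRIZOL_VOLUMES_PER_PLASMA_VOLUME = 3.5  # ratio stated in the protocol


def trizol_volume_ml(plasma_ml: float) -> float:
    """Volume of TRIzol reagent (ml) for a given plasma volume (ml)."""
    if plasma_ml < 0:
        raise ValueError("plasma volume cannot be negative")
    return plasma_ml * TRIZOL_VOLUMES_PER_PLASMA_VOLUME


def total_mixture_ml(plasma_ml: float) -> float:
    """Total volume (ml) of the plasma and TRIzol mixture to be aliquoted."""
    return plasma_ml + trizol_volume_ml(plasma_ml)


# Hypothetical plasma yield of 2 ml, for illustration only.
example_trizol_ml = trizol_volume_ml(2.0)
example_total_ml = total_mixture_ml(2.0)
```

The total volume indicates how many storage micro tubes are needed once a per-tube fill volume is chosen.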

Total RNA extraction was performed according to the following protocol: 300 μl chloroform (119.38 g/mol) was added to each of four micro-tubes containing TRIzol™-plasma mixture of the same individual. The solution was mixed vigorously and incubated for 10 minutes at room temperature. Subsequently, the mixture was centrifuged for 15 minutes at 14,000 rpm at 4° C. The aqueous phase was transferred to a new tube and mixed vigorously with an equal volume of chloroform, incubated for 3 minutes at room temperature and centrifuged for 15 minutes at 14,000 rpm at 4° C. Following the centrifugation, the upper phase was transferred to a new micro tube; next, a total of 1.4 ml RLT buffer from the RNeasy™ mini kit (Qiagen) was added and the tubes were mixed. Thereafter, 1.5 times volume of 100% EtOH per each separated upper phase was added. The solution was well mixed and incubated at −20° C. overnight. Following this incubation, the solution was thawed and 700 μl of the mixture was loaded on an RNeasy™ spin column (Qiagen) and micro-centrifuged at 23° C., 10,000 g for 30 seconds, and the flow-through was discarded. The rest of the thawed sample was loaded and the column centrifuged, as described above, until all the solution was filtered through the RNeasy™ spin column. Further RNA purification was completed by following the RNeasy™ mini kit protocol (Qiagen). In short, the spin column was loaded with the sample and was washed twice with 500 μl of RPE buffer. Finally, RNA was eluted by adding 35 μl of RNase-free water. For complete re-suspension of the RNA, the eluted RNA was incubated first for 5 minutes in a heat block at 65° C. and then on ice for 5 minutes, and spun down. RNA quantity was measured using a NanoDrop™ instrument (Thermo Scientific).

In order to test the gene expression profile using a gene expression chip array, total RNA was purified from the TRIzol-plasma mixture of the same individual. The mixture was thawed on ice, and 15 mg of linear acrylamide and 200 μl of chloroform were added per 1 ml of TRIzol and mixed vigorously. After a 10-minute incubation at room temperature, the mixture was centrifuged for 15 minutes at 14,000 rpm at 4° C. The aqueous phase was isolated and further RNA purification steps were performed as described above for RNA specimen preparation for qPCR.

For testing gene expression levels by qPCR, 10 μl of plasma RNA was used for each cDNA reaction. The reverse transcriptase reaction was performed with qScript buffer mix and RT enzyme. The produced cDNA was stored at −20° C. For gene expression profiling using the Affymetrix expression microarray, cDNA was synthesized, purified and subjected to fragmentation and biotin labeling.

Example 2 Quantification of Expression Levels

Initially, 72 genes were tested for their expression levels in the different subpopulations, of which 17 genes (Table 1B) were selected for the panel of biomarkers for the detection of colorectal cancer.

TABLE 1B
SEQ ID NO: | Nucleic acid sequence | Corresponding gene | Gene SEQ ID NO: | Gene Accession No. (GenBank)
1 | CCT TAC AGC AAC AGA AAG TGA AGG GCC TAA AAA AAC TAG AGA ACT TCA AGA AAA AAG AGG ACG AAA TCA AAC AAT GGT TAG GGA AAG TTT CTC CTG AAG ATG TAG AAT ATT TCA ATT GCC AAC AGG AGC TGG CTT CAG | CHD2 | 75 | NM_001271.3
2 | AGG ATG AGT GAC GAG TTT GTG GAC TCC TTT AAG AAG GGA CTT CCT CGC CCG AAG AGC GCG GGC ACA GCA ACG CAG ATG CGG C AAA GCT CCA GCT GGA CGC GAG TCT TCC AGT CCT GGT GGG ATC GGA ACT TGG GCA G | BAD | 76 | NM_032989.2
3 | CCG TGC TGC TCA CCA AAG GTG AAA TTC GAT GCT ACT GTG ATG CTG CCC ACT GTG TAG CCA CTG GTT ATA TGT GTA AAT CTG AGC | BAMBI | 77 | NM_012342.2
4 | GGA AGA GGT ATG GGA GGA CAT GGC TAT GGT GGA GCT GGT GAT GCA AGT TCA GGT TTT CAT GGT GGT CAT TTC GTA CAT ATG AGA GGG TTG CCT TTT CGT GCA ACT GAA AAT GAC ATT GCT AAT TTC TTC TCA CCA CTA AAT CCA ATA CG | HNRNPH3 | 78 | NM_012207.2
5 | CGC CCT ACT ACA TGT CAC CGG AGA GGA TCC ATG AGA ACG GCT ACA ACT TCA AGT CCG ACA TCT GGT CCC TGG GCT GTC TGC TGT ACG AGA TGG CAG CCC TCC AGA GCC CCT TCT ATG GAG ATA AGA TGA ATC TCT TCT CCC TGT GCC A | NEK6 | 79 | NM_014397.5
6 | AGC CTA TGA ATT CTA CCA TGC GCT AGA CTC CGA GAA CAT GAC CAG AAC TTG TGC ACC AAG GGT CAG GTA GTA AGT GGC CAG TAC CGG ATG CTC GCA AAG | EPAS1 | 80 | NM_001430.4
7 | TGA AGA TGG AGG CAT TAT CCG GAG AAC CAA ACG GAA AGG AGA GGG ATA TTC AAA TCC AAA CGA AGG AGC AAC AGT AGA AAT CCA CCT GGA AGG CCG CTG TGG TGG AAG GAT GTT TGA CTG CAG AGA TGT GGC ATT CAC TGT G | FKBP5 | 81 | NM_001145776.1
8 | TGG CTC TCC TTG TCA TTT TCC AGG TAT GCC TGT GTC AAG ATG AGG TCA CGG ACG ATT ACA TCG GAG ACA ACA CCA CAG TGG ACT ACA CTT TGT TCG AGT CTT TGT GCT CCA AGA AGG ACG TGC GGA ACT TTA A | CCR7 | 82 | NM_001838.3
9 | CCA GTG GAA CTT TAG ACC TCA GCA AAC AGA AAT ATA TGT GGT GCC AGG AGA GAC TGC ACT GGC GTT TTA CAG AGC TAA GAA TCC TAC TGA CAA ACC AGT AAT TGG AAT TTC TAC ATA CAA TAT TGT TCC ATT TGA AGC TGG ACA GTA TTT | COX11 | 83 | NM_001162861.1
10 | CAA CAC CTT CCA CCA ATA CTC TGT GAA GCT GGG GCA CCC AGA CAC CCT GAA CCA GGG GGA ATT CAA AGA GCT GGT GCG AAA AGA TCT GCA AAT TTT CTC AAG AAG GAG AAT AAG AAT GAA AAG GTC ATA GAA CAC ATC ATG GAG G | S100A9 | 84 | NM_002965.3
11 | GTC ATC AAG CAC CTG AAC AGG TTC AAG TTC TTT CTT CAA AGA GTC ATC AGA ATA ACA TGG ATT GAA GAG ACT TCC GAA CAC TTG CTA TCT CTT GCT GCT GCT GTT TCA TGG AAG GAG A | CHPT1 | 85 | NM_020244.2
12 | CTC CCA TCT CAA AGC CCA TTA CAG AGT GCA TAC AGG TGA ACG GCC CTT TCC CTG CAC GTG GCC AGA CTG CCT TAA AAA GTT CTC C | KLF9 | 86 | NM_001206.2
13 | GTT TTC AAT GAG TAC CAG AGA ATG ACA GGC CGG GAC ATT GAG AAG AGC ATC TGC CGG GAG ATG TCC GGG GAC CTG GAG GAG GGC ATG CTG GCC GTG GTG AAA TGT CTC AAG AAT ACC CCA GCC TTC TTT GCG GAG AGG CTC AAC AAG GCC | ANXA11 | 87 | NM_145868.1
14 | GAC CCA CCC ACA TAC ATC AGG GAC CTC TCC ATC CAT CAT GCT GCG TCA CAG TCC ATG GCT CCA ATG GCT TGT TGA TCA AGG ACG TTG TGG GCT ATA ACT CTT TGG G | KIAA1199 | 88 | NM_018689.1
15 | TCT GCC ACT AAT TCG ACA TCA GTT TCA TCG AGG AAA GCT GAA AAT AAA TAT GCA GGA GGG AAC CCC GTT TGC GTG CGC CCA ACT CCC AAG TGG CAA AAA GGA ATT GGA GAA | KIAA0101 | 89 | NM_014736.5
16 | AAT GAG TTC CTT CTA CAG TCA GAT ATT GAC TTC ATC ATA TTG GAT TGG TTC CAC GCT ATC AAA AAT GCA ATT GAC AGA TTG CCA AAG GAT TCA AGT TGT CCA TCA AGA AAC CTG GAA TTA TTC AAA ATC CAA AGA TCC TCT AGC ACT GAA | ARHGAP15 | 90 | NM_018460.3
17 | CAG GAA GAT GGG CAA GAT GAT GGT GAA GGC CCT GTC AGA AGA GAT GGC AGA CAC TCT GGA GGA GGG CTC TGC CTC CCC GAC ATC TCC AGA CTA CAG CCT | SASH3 | 91 | NM_018990.3

Subsequently, the required volume of cDNA was diluted ×4, of which 2 μl were used for qPCR. For a typical qPCR reaction, the PerfeCTa qPCR SuperMix (catalog # 95065, Quanta) was used together with a forward and reverse primer set specific for each gene (Table 2), hydrolysis probes and diluted cDNA in a final volume of 20 μl. qPCR was performed in a 96-well PCR plate for 52 cycles at Quanta's specified conditions in an ABI Prism 7900 system. The fluorescently labeled probes listed in Table 2 include one or more of the following labels: FAM at the 5′ end (also known as 56-FAM), IABkFQ at the 3′ end, and may further include N,N-diethyl-4-(4-nitronaphthalen-1-ylazo)-phenylamine (also known as ‘ZEN’). ZEN may be incorporated at any position. For example, ZEN may be incorporated at position 9 from the 3′ end, position 10 from the 3′ end, or in the middle of the probe (such that about the same number of nucleotides extend in the 3′ and 5′ directions counting from the ZEN position). The reference genes for normalization were human HPRT1 and human TFRC. Delta-delta Ct and relative quantification for each gene were calculated by DataAssist v3.0. The reference gene primer and probe sequences are as follows: hHPRT1 gene, forward primer: TATGCTGAGGATTTGGAAAGG (SEQ ID NO: 18), reverse primer: CATCTCCTTCATCACATCTCG (SEQ ID NO: 19; final concentration 300 nM), probe: FAM-TATGGACAGGACTGAACG-3′IABkFQ (SEQ ID NO: 20) with addition of 4 LNAs (final concentration 200 nM). hTFRC gene, forward primer: TTGCATATTCTGGAATCCCA (SEQ ID NO: 21), reverse primer: TCAGTTCCTTATAGGTGTCCATG (SEQ ID NO: 22; final concentration 500 nM), probe: FAM-TCTGTGTCCTCGCAAAAA-3′IABkFQ (SEQ ID NO: 23) with addition of 5 LNAs (final concentration 250 nM). An exemplary flow chart of the process is shown in FIG. 1.

The final primer and probe concentrations for each gene were determined using a calibration curve spanning a 100-fold range in 6 cDNA dilutions. The primer and probe concentrations that gave the optimal calibration curve slope (−3.3) with a fit of R²>0.95 were chosen as the optimal concentrations for each gene (FIGS. 2A-2B).
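The calibration criterion above can be illustrated with a short sketch. It fits Ct against log10 of the relative cDNA amount by ordinary least squares and converts the slope to an amplification efficiency using the standard relation E = 10^(−1/slope) − 1, under which a slope of about −3.3 corresponds to roughly 100% efficiency. The function names and the six Ct values are hypothetical and are not taken from the study data.

```python
def fit_standard_curve(log10_amounts, ct_values):
    """Least-squares fit of Ct = slope * log10(amount) + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(log10_amounts)
    mean_x = sum(log10_amounts) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_amounts)
    syy = sum((y - mean_y) ** 2 for y in ct_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_amounts, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def amplification_efficiency(slope):
    """Standard qPCR relation: efficiency = 10^(-1/slope) - 1."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical example: six dilutions spanning a 100-fold range.
log10_amounts = [0.0, -0.4, -0.8, -1.2, -1.6, -2.0]
ct_values = [25.00, 26.32, 27.64, 28.96, 30.28, 31.60]

slope, intercept, r_squared = fit_standard_curve(log10_amounts, ct_values)
efficiency = amplification_efficiency(slope)
```

With these illustrative values the fitted slope is about −3.3, so a primer and probe concentration producing such a curve would pass the stated criterion provided R² also exceeds 0.95.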

TABLE 2
Gene Name | Gene SEQ ID NO. | Primer/Probe | Sequence | SEQ ID NO.
ANXA11 | 87 | Probe | TGG CCG TGG TGA AAT GTC TCA AGA | 24
ANXA11 | 87 | Primer 1 (Fw) | GGC CTT GTT GAG CCT CTC | 25
ANXA11 | 87 | Primer 2 (Rev) | GTT TTC AAT GAG TAC CAG AGA ATG AC | 26
ARHGAP15 | 90 | Probe | CAG ATT GCC AAA GGA TTC AAG TTG TCC A | 27
ARHGAP15 | 90 | Primer 1 (Fw) | TTC AGT GCT AGA GGA TCT TTG G | 28
ARHGAP15 | 90 | Primer 2 (Rev) | AAT GAG TTC CTT CTA CAG TCA GAT | 29
BAD | 76 | Probe | CTG GAG CTT TGC CGC ATC TGC | 30
BAD | 76 | Primer 1 (Fw) | AGG ATG AGT GAC GAG TTT GTG | 31
BAD | 76 | Primer 2 (Rev) | CTG CCC AAG TTC CGA TCC | 32
BAMBI | 77 | Probe | TTC GAT GCT ACT GTG ATG CTG CCC | 33
BAMBI | 77 | Primer 1 (Fw) | CCG TGC TGC TCA CCA AA | 34
BAMBI | 77 | Primer 2 (Rev) | GCT CAG ATT TAC ACA TAT AAC CAG TG | 35
CCR7 | 82 | Probe | TG ACC TCA TC TTG ACA CAG GCA TAC C | 36
CCR7 | 82 | Primer 1 (Fw) | TTA AAG TTC CGC ACG TCC TT | 37
CCR7 | 82 | Primer 2 (Rev) | TGG CTC TCC TTG TCA TTT TCC | 38
CHD2 | 75 | Probe | CGA AAT CAA ACA ATG GTT AGG GAA AGT TTC TCC | 39
CHD2 | 75 | Primer 1 (Fw) | CCT TAC AGC AAC AGA AAG TGA AG | 40
CHD2 | 75 | Primer 2 (Rev) | CTG AAG CCA GCT CCT GTT | 41
CHPT1 | 85 | Probe | AGC AAG TGT TCG GAA GTC TCT TCA ATC C | 42
CHPT1 | 85 | Primer 1 (Fw) | TCT CCT TCC ATG AAA CAG CAG | 43
CHPT1 | 85 | Primer 2 (Rev) | GTC ATC AAG CAC CTG AAC AG | 44
COX11 | 83 | Probe | AAA ACG CCA GTG CAG TCT CTC CT | 45
COX11 | 83 | Primer 1 (Fw) | CCA GTG GAA CTT TAG ACC TCA G | 46
COX11 | 83 | Primer 2 (Rev) | AAA TAC TGT CCA GCT TCA AAT GG | 47
EPAS1 | 80 | Probe | AGA GTC ACC AGA ACT TGT GCA CCA A | 48
EPAS1 | 80 | Primer 1 (Fw) | AGC CTA TGA ATT CTA CCA TGC G | 49
EPAS1 | 80 | Primer 2 (Rev) | CTT TGC GAG CAT CCG GTA | 50
FKBP5 | 81 | Probe | TC AAA CAT CC TTC CAC CAC AGC GG | 51
FKBP5 | 81 | Primer 1 (Fw) | CAC AGT GAA TGC CAC ATC TCT | 52
FKBP5 | 81 | Primer 2 (Rev) | TGA AGA TGG AGG CAT TAT CCG | 53
HNRNPH3 | 78 | Probe | TTC AGG TTT TCA TGG TGG TCA TTT CG | 54
HNRNPH3 | 78 | Primer 1 (Fw) | GGA AGA GGT ATG GGA GGA CA | 55
HNRNPH3 | 78 | Primer 2 (Rev) | CGT ATT GGA TTT AGT GGT GAG AAG | 56
KIAA0101 | 89 | Probe | AAA CGG GGT TCC CTC CTG CAT ATT | 57
KIAA0101 | 89 | Primer 1 (Fw) | TCT GCC ACT AAT TCG ACA TCA G | 58
KIAA0101 | 89 | Primer 2 (Rev) | CTC CAA TTC CTT TTT GCC ACT T | 59
KIAA1199 | 88 | Probe | CCT CTC CAT CCA TCA TAC ATT CTC TCG CT | 60
KIAA1199 | 88 | Primer 1 (Fw) | GAC CCA CCC ACA TAC ATC AG | 61
KIAA1199 | 88 | Primer 2 (Rev) | CCC AAA GAG TTA TAG CCC ACA A | 62
KLF9 | 86 | Probe | AG TGC ATA CA GGT GAA CGG CCC | 63
KLF9 | 86 | Primer 1 (Fw) | GGA GAA CTT TTT AAG GCA GTC TG | 64
KLF9 | 86 | Primer 2 (Rev) | CTC CCA TCT CAA AGC CCA TT | 65
NEK6 | 79 | Probe | AG GAT CCA TG AGA ACG GCT ACA ACT TC | 66
NEK6 | 79 | Primer 1 (Fw) | TGG CAC AGG GAG AAG AGA T | 67
NEK6 | 79 | Primer 2 (Rev) | CGC CCT ACT ACA TGT CAC C | 68
S100A9 | 84 | Probe | AG CTC TTT GA ATT CCC CCT GGT TCA | 69
S100A9 | 84 | Primer 1 (Fw) | CCT CCA TGA TGT GTT CTA TGA CC | 70
S100A9 | 84 | Primer 2 (Rev) | CAA CAC CTT CCA CCA ATA CTC T | 71
SASH3 | 91 | Probe | AGA AGA GAT GGC AGA CAC TCT GGA GG | 72
SASH3 | 91 | Primer 1 (Fw) | AGG CTG TAG TCT GGA GAT GTC | 73
SASH3 | 91 | Primer 2 (Rev) | CAG GAA GAT GGG CAA GAT GA | 74

Example 3 Data Analysis

The presence of pre-cancerous polyps or adenocarcinoma of the colon, or a normal colon as determined by full colonoscopy, was identified based on the presence or absence of the specific molecular markers and their combinations, as schematically exemplified in FIGS. 2A-2B. All statistical analyses were performed with the SPSS package, version 21 (IBM SPSS Statistics).

Initially, blood was collected from subjects that underwent colonoscopy. Thereby, the results of the colonoscopy and the pathology report for cases where a biopsy sample was taken, or pathology report for carcinoma cases, were used as a reference for the state of the study group. The methodology was also used to identify gene combinations that can provide an optimal biomarker of advanced adenoma and cancer disease states. As detailed above, the study cohort (Table 1A) was designed to consist of 3 subject groups of normal subjects (n=55), advanced adenoma (AA; n=47) and colorectal cancer (CRC; n=35).

Normalization of gene expression by qPCR was based on the expression of two reference genes stably expressed in the plasma: human HPRT1 and human TFRC. The primer-probe ratio was calibrated for low RNA amounts, yielding optimal PCR efficiency over 3 orders of magnitude of cDNA concentrations (linear dynamic range).

All PCR results were recorded as Relative Quantity (RQ), calculated by the formula RQ = 2^(−ΔCt), where ΔCt (delta Ct) is the difference between the Ct measured for a candidate detector gene marker and the Ct of the reference housekeeping genes hHPRT1 and hTFRC.
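As a worked illustration of the RQ formula, the sketch below computes the delta Ct and RQ for one marker. The text does not state how the two reference genes are combined, so the arithmetic mean of their Ct values is used here as an assumption; the function names and Ct values are hypothetical.

```python
def delta_ct(target_ct, reference_cts):
    """Ct of the candidate marker minus the reference Ct.

    Assumption: the reference Ct is the arithmetic mean of the
    reference-gene Ct values (e.g. hHPRT1 and hTFRC).
    """
    reference_mean = sum(reference_cts) / len(reference_cts)
    return target_ct - reference_mean


def relative_quantity(target_ct, reference_cts):
    """RQ = 2^(-delta Ct)."""
    return 2.0 ** (-delta_ct(target_ct, reference_cts))


# Hypothetical Ct values: marker at 30.0, references at 28.0 and 30.0.
rq_low = relative_quantity(30.0, [28.0, 30.0])   # delta Ct = +1
rq_high = relative_quantity(27.0, [28.0, 30.0])  # delta Ct = -2
```

A marker that crosses the threshold one cycle later than the reference mean gives RQ = 0.5, and one that crosses two cycles earlier gives RQ = 4.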

The cutoff values were determined to ensure that all healthy subjects (Normal) fall below it. The cutoff values for representative biomarkers are listed in Table 3.

TABLE 3
SEQ ID NO: | Biomarker | Cutoff Value
1 | CHD2 | >10
2 | BAD | >28
3 | BAMBI | >3.5
5 | NEK6 | >3.3
6 | EPAS1 | >0.25
7 | FKBP5 | >2
12 | KLF9 | >7
17 | SASH3 | >2.6

Several analytic methods were applied to determine the state of the disease, based on data derived from samples taken from healthy subjects, subjects with precancerous advanced polyps and subjects with colorectal cancer.

It was further established that by taking a combination of biomarkers, the sensitivity of identification of colorectal cancer is improved without compromising the specificity. To compare the different values, corresponding to the expression level ranges of each biomarker in the combination, a combinatorial data analysis algorithm was applied. Once a combination of biomarkers was chosen, the expression level of each biomarker in the combination was compared to its cutoff value. The cutoff values of representative biomarkers are listed in Table 3. Using this algorithm, a value of 1 was assigned to a combination of biomarkers if the expression level of each biomarker in the combination was below its predetermined cutoff value. A value of 2 was assigned to a combination of biomarkers if the expression level of at least one biomarker in said combination was above its predetermined cutoff value. The assigned values (1 or 2) are also referred to herein as normalized expression levels. The normalized expression levels in healthy (N), precancerous (AD) and cancer (CA) populations of the combinations COX11, KIAA1199 and BAD (SEQ ID NOs: 9, 14 and 2; Table 4B) and CHD2 and EPAS1 (SEQ ID NOs: 1 and 6; Table 4A) are presented in FIGS. 4A and 4B, respectively, where expression levels above the cutoff are presented in bold (Tables 4A and 4B).
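The combinatorial rule described above can be sketched as follows. The function name, the dictionary layout and the sample values are hypothetical; the two cutoffs are the Table 3 values for CHD2 and EPAS1 and are used only for illustration. The text does not say how a value exactly equal to its cutoff is treated, so the sketch assumes it is not counted as above the cutoff.

```python
def combination_code(expression, cutoffs):
    """Return 2 if any biomarker exceeds its cutoff, otherwise 1.

    expression: dict mapping gene name -> measured expression level
    cutoffs:    dict mapping gene name -> predetermined cutoff value
    Assumption: a value equal to the cutoff is not "above" it.
    """
    for gene, cutoff in cutoffs.items():
        if expression[gene] > cutoff:
            return 2
    return 1


# Illustrative cutoffs (Table 3 values for CHD2 and EPAS1).
cutoffs = {"CHD2": 10.0, "EPAS1": 0.25}

# Hypothetical samples, not taken from the study tables.
sample_below = {"CHD2": 1.5, "EPAS1": 0.10}
sample_one_above = {"CHD2": 14.0, "EPAS1": 0.00}

code_below = combination_code(sample_below, cutoffs)          # 1
code_one_above = combination_code(sample_one_above, cutoffs)  # 2
```

Because a single marker above its cutoff is enough to assign the value 2, adding markers to a combination can only keep or raise the number of positive calls, which is why sensitivity tends to rise while specificity may fall.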

TABLE 4A
Clinical Group | Sample | Normalized expression CHD2 | Normalized expression EPAS1 | Binary code CHD2 + EPAS1
N | 3162 | 0.615 | 0.000 | 1
N | 3166 | 1.496 | 0.000 | 1
N | 3176 | 4.215 | 0.000 | 1
N | 3250 | 1.689 | 0.000 | 1
N | 3253 | 2.646 | 0.119 | 1
N | 3254 | 0.958 | 0.071 | 1
N | 3255 | 0.062 | 0.066 | 1
N | 3260 | 10.744 | 2.014 | 2
N | 3263 | 2.000 | 0.000 | 1
N | 3265 | 6.761 | 0.000 | 1
N | 3267 | 1.428 | 0.087 | 1
N | 3269 | 0.850 | 0.000 | 1
N | 3274 | 0.000 | 0.000 | 1
N | 3275 | 0.378 | 0.000 | 1
N | 3280 | 0.000 | 0.000 | 1
N | 3281 | 14.466 | 2.681 | 2
N | 3297 | 2.695 | 0.000 | 1
N | 3363 | 0.000 | 0.254 | 1
N | 3386 | 2.733 | 0.104 | 1
N | 3388 | 3.433 | 0.638 | 1
N | 3420 | 0.000 | 0.000 | 1
N | 3421 | 2.420 | 0.492 | 1
N | 3422 | 2.603 | 0.137 | 1
N | 3436 | 2.642 | 0.000 | 1
N | 3438 | 3.678 | 0.000 | 1
N | 3454 | 0.000 | 0.000 | 1
AD | 3151 | 0.468 | 0.107 | 1
AD | 3213 | 1.457 | 0.046 | 1
AD | 3218 | 0.987 | 0.000 | 1
AD | 3221 | 0.923 | 0.000 | 1
AD | 3273 | 8.384 | 0.000 | 1
AD | 3284 | 0.000 | 0.000 | 1
AD | 3324 | 3.957 | 0.183 | 1
AD | 3341 | 0.000 | 0.175 | 1
AD | 3344 | 2.923 | 0.062 | 1
AD | 3345 | 1.267 | 0.000 | 1
AD | 3349 | 24.440 | 0.697 | 2
AD | 3350 | 1.371 | 0.034 | 1
AD | 3356 | 0.918 | 0.079 | 1
AD | 3357 | 1.969 | 0.182 | 1
AD | 3366 | 0.058 | 0.000 | 1
AD | 3433 | 2.915 | 0.000 | 1
AD | 3437 | 1.689 | 0.000 | 1
CA | 3123 | 7.578 | 0.056 | 1
CA | 3124 | 13.197 | 0.308 | 2
CA | 3129 | 5.829 | 0.272 | 1
CA | 3147 | 17.961 | 0.000 | 2
CA | 3168 | 10.596 | 0.146 | 2
CA | 3290 | 5.071 | 0.132 | 1
CA | 3312 | 7.126 | 0.000 | 1
CA | 3313 | 38.283 | 1.714 | 2
CA | 3319 | 1.530 | 0.000 | 1
CA | 3327 | 1.916 | 0.000 | 1
CA | 3331 | 9.350 | 0.826 | 2
CA | 3337 | 3.414 | 0.618 | 1
CA | 3338 | 9.373 | 0.224 | 1
CA | 3343 | 10.697 | 0.072 | 2
CA | 3374 | 0.000 | 0.306 | 1
CA | 3408 | 5.728 | 4.583 | 2
CA | 3412 | 0.407 | 0.076 | 1
CA | 3440 | 13.793 | 0.000 | 2
CA | 3668 | 18.139 | 0.076 | 2
CA | 3775 | 13.939 | 0.273 | 2
CA | 3783 | 52.058 | 0.308 | 2
CA | 3808 | 16.818 | 0.113 | 2
CA | 3851 | 21.649 | 0.280 | 2
CA | 3874 | 18.359 | 0.119 | 2

TABLE 4B
Clinical Group | Sample | Normalized expression COX11 | Normalized expression KIAA1199 | Normalized expression BAD | Binary code COX11 + KIAA1199 + BAD
N | 3162 | 0.980 | 0.000 | 6.067 | 1
N | 3166 | 2.512 | 0.158 | 18.188 | 1
N | 3176 | 1.792 | 0.333 | 5.400 | 1
N | 3250 | 2.104 | 0.147 | 27.796 | 1
N | 3253 | 0.736 | 0.000 | 18.516 | 1
N | 3254 | 2.755 | 0.000 | 14.599 | 1
N | 3255 | 1.784 | 0.000 | 19.566 | 1
N | 3260 | 2.309 | 0.000 | 2.433 | 1
N | 3263 | 1.130 | 0.242 | 10.024 | 1
N | 3265 | 4.605 | 0.175 | 53.725 | 2
N | 3267 | 1.413 | 0.238 | 20.857 | 1
N | 3269 | 0.723 | 0.000 | 17.216 | 1
N | 3274 | 2.973 | 0.000 | 92.903 | 2
N | 3275 | 0.474 | 0.000 | 4.467 | 1
N | 3280 | 0.917 | 0.000 | 0.000 | 1
N | 3281 | 2.034 | 0.000 | 4.253 | 1
N | 3297 | 2.313 | 0.000 | 3.447 | 1
N | 3363 | 0.000 | 0.197 | 0.000 | 1
N | 3386 | 1.943 | 0.000 | 20.101 | 1
N | 3388 | 3.940 | 0.513 | 62.090 | 2
N | 3420 | 0.366 | 0.255 | 2.675 | 1
N | 3421 | 0.957 | 0.000 | 13.040 | 1
N | 3422 | 0.979 | 0.256 | 9.763 | 1
N | 3436 | 2.485 | 0.178 | 5.612 | 1
N | 3438 | 2.589 | 0.234 | 15.265 | 1
N | 3454 | 1.900 | 0.000 | 11.171 | 1
AD | 3151 | 0.158 | 0.000 | 4.425 | 1
AD | 3213 | 1.401 | 0.031 | 38.429 | 2
AD | 3218 | 2.275 | 0.000 | 15.083 | 1
AD | 3221 | 0.680 | 0.000 | 15.657 | 1
AD | 3273 | 1.509 | 0.000 | 48.621 | 2
AD | 3284 | 3.068 | 0.000 | 17.120 | 2
AD | 3324 | 4.656 | 0.158 | 33.911 | 2
AD | 3341 | 1.547 | 0.874 | 19.982 | 2
AD | 3344 | 4.337 | 13.003 | 2
AD | 3345 | 1.168 | 0.091 | 3.817 | 1
AD | 3349 | 11.598 | 1.224 | 6.193 | 2
AD | 3350 | 3.384 | 0.115 | 3.674 | 2
AD | 3356 | 0.991 | 0.149 | 7.143 | 1
AD | 3357 | 1.537 | 0.206 | 26.014 | 1
AD | 3366 | 0.336 | 0.000 | 6.222 | 1
AD | 3433 | 5.796 | 0.240 | 62.200 | 2
AD | 3437 | 1.591 | 0.179 | 7.289 | 1
CA | 3123 | 1.941 | 0.000 | 11.771 | 1
CA | 3124 | 7.232 | 0.416 | 18.660 | 2
CA | 3129 | 3.624 | 0.419 | 11.240 | 2
CA | 3147 | 14.408 | 0.238 | 57.370 | 2
CA | 3168 | 10.313 | 0.037 | 10.152 | 2
CA | 3290 | 5.633 | 0.408 | 9.478 | 2
CA | 3312 | 5.958 | 0.545 | 24.071 | 2
CA | 3313 | 25.183 | 4.720 | 100.340 | 2
CA | 3319 | 7.096 | 0.358 | 42.461 | 2
CA | 3327 | 4.907 | 0.000 | 1.694 | 2
CA | 3331 | 9.995 | 0.435 | 51.915 | 2
CA | 3337 | 2.163 | 0.339 | 12.153 | 1
CA | 3338 | 8.872 | 0.576 | 26.613 | 2
CA | 3343 | 9.892 | 0.223 | 39.434 | 2
CA | 3374 | 0.000 | 0.000 | 0.000 | 1
CA | 3408 | 5.382 | 0.000 | 53.354 | 2
CA | 3412 | 1.377 | 0.611 | 31.927 | 2
CA | 3440 | 8.658 | 0.000 | 88.402 | 2
CA | 3668 | 13.485 | 0.214 | 2.661 | 2

Example 4 Identification of Colorectal Cancer

To identify colorectal cancer with at least one biomarker, biomarkers were chosen such that their sensitivity and specificity for cancer are the highest and their sensitivity to precancerous advanced polyps is minimal. The results of the single biomarker analysis, considering biomarkers with an expression level above the predetermined cutoff, are presented hereinafter in Table 5. For example, as shown in Table 5, CHD2 (SEQ ID NO: 1) shows a specificity of 97% and a sensitivity of 19% in the detection of colorectal cancer.
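The CHD2 figures quoted above follow from the counts in Table 5 (2 of 59 normal samples and 7 of 36 cancer samples above the cutoff). The sketch below shows the arithmetic; the function names are illustrative only.

```python
def sensitivity(cases_above_cutoff, total_cases):
    """Fraction of diseased subjects called positive."""
    return cases_above_cutoff / total_cases


def specificity(normals_above_cutoff, total_normals):
    """Fraction of healthy subjects called negative."""
    return 1.0 - normals_above_cutoff / total_normals


# CHD2 counts as listed in Table 5.
chd2_sensitivity = sensitivity(7, 36)   # cancer samples above cutoff
chd2_specificity = specificity(2, 59)   # normal samples above cutoff

chd2_sensitivity_pct = round(chd2_sensitivity * 100)
chd2_specificity_pct = round(chd2_specificity * 100)
```

Rounded to whole percentages this gives a sensitivity of 19% and a specificity of 97%.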

A combination or a subgroup of biomarkers may be used for identification of the subject as having colorectal cancer, while not compromising the specificity, by applying the combinatorial data analysis algorithm.

As shown in Table 6, combinatorial data analysis of two biomarkers, BAMBI (SEQ ID NO: 3) and HNRNPH3 (SEQ ID NO: 4), may increase the sensitivity of identification in comparison to the sensitivity of each of the biomarkers alone.

Table 7 shows that combinatorial data analysis of two biomarkers, CHD2 (SEQ ID NO: 1) and EPAS1 (SEQ ID NO: 6), may increase the sensitivity of identification in comparison to the sensitivity of each of the biomarkers alone.

Table 8 shows that combinatorial data analysis of three biomarkers, BAMBI (SEQ ID NO: 3), HNRNPH3 (SEQ ID NO: 4) and CHD2 (SEQ ID NO: 1), increases the sensitivity of identification in comparison to the sensitivity of each of the biomarkers alone.

TABLE 5
Biomarker | Total sample no. | Normal: above cutoff/Total | Normal: % | Advanced polyp: above cutoff/Total | Advanced polyp: % | Cancer: above cutoff/Total | Cancer: %
BAD | 144 | 4/62 | 6.5% | 16/46 | 34.8% | 16/36 | 44.4%
BAMBI | 141 | 10/58 | 17.2% | 8/46 | 17.4% | 14/37 | 37.8%
NEK6 | 113 | 5/43 | 11.6% | 15/39 | 38.5% | 10/31 | 32.3%
EPAS1 | 108 | 5/36 | 13.9% | 2/41 | 4.9% | 16/31 | 51.6%
FKBP5 | 81 | 6/40 | 15.0% | 9/31 | 29.0% | 3/10 | 30.0%
CCR7 | 71 | 8/35 | 22.9% | 2/22 | 9.1% | 2/14 | 14.3%
CHD2 | 140 | 2/59 | 3.4% | 0/45 | 0.0% | 7/36 | 19.4%
COX11 | 106 | 12/51 | 23.5% | 3/27 | 11.1% | 5/28 | 17.9%
S100A9 | 73 | 5/31 | 16.1% | 4/21 | 19.0% | 5/21 | 23.8%
CHPT1 | 61 | 9/24 | 37.5% | 2/16 | 12.5% | 8/21 | 38.1%
KLF9 | 98 | 8/44 | 18.2% | 4/34 | 11.8% | 5/20 | 25.0%
ANXA11 | 29 | 4/13 | 30.8% | 0/12 | 0.0% | 2/4 | 50.0%
KIAA1199 | 60 | 8/26 | 30.8% | 0/16 | 0.0% | 9/18 | 50.0%
KIAA0101 | 55 | 7/31 | 22.6% | 3/18 | 16.7% | 0/6 | 0.0%
ARHGAP15 | 52 | 7/29 | 24.1% | 2/18 | 11.1% | 2/5 | 40.0%
SASH3 | 98 | 7/33 | 21.2% | 11/40 | 27.5% | 4/25 | 16.0%
HNRNPH3 | 69 | 2/34 | 5.88% | 0/18 | 0.00% | 6/17 | 35.29%

TABLE 6
Group/Diagnosis | Biomarker(s) | No. of Patients | No. of Positive | Sensitivity | Specificity
Group 1/Colonoscopy Negative | BAMBI | 24 | 4 | N/R | 83.3
Group 2/Advanced polyp | BAMBI | 12 | 0 | 0.0 | N/R
Group 3/Cancer (stages I-III) | BAMBI | 23 | 9 | 39.1 | N/R
Group 1/Colonoscopy Negative | HNRNPH3 | 24 | 1 | N/R | 95.8
Group 2/Advanced polyp | HNRNPH3 | 12 | 0 | 0.0 | N/R
Group 3/Cancer (stages I-III) | HNRNPH3 | 23 | 7 | 30.4 | N/R
Group 1/Colonoscopy Negative | BAMBI + HNRNPH3 | 24 | 5 | N/R | 79.2
Group 2/Advanced polyp | BAMBI + HNRNPH3 | 12 | 0 | 0.0 | N/R
Group 3/Cancer (stages I-III) | BAMBI + HNRNPH3 | 23 | 13 | 56.5 | N/R

TABLE 7
Group¹ | Biomarker(s) | No. of Patients | No. of Positive | Sensitivity | Specificity
Group 1 | CHD2 | 26 | 2 | N/R | 92.3
Group 2 | CHD2 | 17 | 1 | 5.9 | N/R
Group 3 | CHD2 | 24 | 12 | 50.0 | N/R
Group 1 | EPAS1 | 26 | 2 | N/R | 92.3
Group 2 | EPAS1 | 17 | 0 | 0.0 | N/R
Group 3 | EPAS1 | 24 | 3 | 12.5 | N/R
Group 1 | CHD2 + EPAS1 | 26 | 1 | N/R | 92.3
Group 2 | CHD2 + EPAS1 | 17 | 1 | 5.9 | N/R
Group 3 | CHD2 + EPAS1 | 24 | 14 | 58.3 | N/R
¹Groups are assigned to diagnosis as in Table 6.

TABLE 8
Group² | Biomarker(s) | No. of Patients | No. of Positive | Sensitivity | Specificity
Group 1 | BAMBI | 23 | 3 | N/R | 87.0
Group 2 | BAMBI | 10 | 0 | 0.0 | N/R
Group 3 | BAMBI | 21 | 9 | 42.9 | N/R
Group 1 | HNRNPH3 | 23 | 1 | N/R | 95.7
Group 2 | HNRNPH3 | 10 | 0 | 0.0 | N/R
Group 3 | HNRNPH3 | 21 | 7 | 33.3 | N/R
Group 1 | CHD2 | 23 | 1 | N/R | 95.7
Group 2 | CHD2 | 10 | 0 | 0.0 | N/R
Group 3 | CHD2 | 21 | 7 | 33.3 | N/R
Group 1 | BAMBI + HNRNPH3 + CHD2 | 23 | 5 | N/R | 78.3
Group 2 | BAMBI + HNRNPH3 + CHD2 | 10 | 0 | 0.0 | N/R
Group 3 | BAMBI + HNRNPH3 + CHD2 | 21 | 14 | 66.7 | N/R
²Groups are assigned to diagnosis as in Table 6.

Table 9 shows that combinatorial data analysis of four biomarkers, CHD2 (SEQ ID NO: 1), EPAS1 (SEQ ID NO: 6), HNRNPH3 (SEQ ID NO: 4) and KIAA1199 (SEQ ID NO: 14), increases the sensitivity of identification in comparison to the sensitivity of each of the biomarkers alone.

TABLE 9
Group³ | Biomarker(s) | No. of Patients | No. of Positive | Sensitivity | Specificity
Group 1 | CHD2 | 13 | 1 | N/R | 92.3
Group 2 | CHD2 | 10 | 0 | 0.0 | N/R
Group 3 | CHD2 | 13 | 9 | 69.2 | N/R
Group 1 | EPAS1 | 13 | 0 | N/R | 100.0
Group 2 | EPAS1 | 10 | 0 | 0.0 | N/R
Group 3 | EPAS1 | 13 | 4 | 30.8 | N/R
Group 1 | HNRNPH3 | 13 | 0 | N/R | 100.0
Group 2 | HNRNPH3 | 10 | 0 | 0.0 | N/R
Group 3 | HNRNPH3 | 13 | 4 | 30.8 | N/R
Group 1 | KIAA1199 | 13 | 1 | N/R | 92.3
Group 2 | KIAA1199 | 10 | 1 | 10.0 | N/R
Group 3 | KIAA1199 | 13 | 9 | 69.2 | N/R
Group 1 | CHD2 + EPAS1 + HNRNPH3 + KIAA1199 | 13 | 2 | N/R | 84.6
Group 2 | CHD2 + EPAS1 + HNRNPH3 + KIAA1199 | 10 | 1 | 10.0 | N/R
Group 3 | CHD2 + EPAS1 + HNRNPH3 + KIAA1199 | 13 | 13 | 100.0 | N/R
³Groups are assigned to diagnosis as in Table 6.

In another analytic approach, two datasets of qPCR delta Ct results have been defined, Cancer-Healthy and AD-Healthy. Relationship between genes, as well as dispersion measures of genes among case-healthy groups, were calculated.

In the Cancer-Healthy dataset the correlation between the eight genes listed in Table 3 revealed two clusters of genes that were highly correlated to each other. Cluster 1 includes the genes CHD2, BAD and BAMBI (SEQ ID NOs: 1-3, respectively) and Cluster 2 includes the genes NEK6, FKBP5 and SASH3 (SEQ ID NOs: 5, 7 and 17, respectively). According to these findings the following features were generated:

1. Max_BAD_BAMBI_CHD2: this feature corresponds to the maximum value among the three genes CHD2, BAD and BAMBI (SEQ ID NOs: 1-3, respectively);
2. Max_FKBP5_SASH3_NEK6: this feature corresponds to the maximum value among the three genes NEK6, FKBP5 and SASH3 (SEQ ID NOs: 5, 7 and 17, respectively).

Logistic regression was used to develop a classification model for Cancer-Healthy using four features:

a) Max_BAD_BAMBI_CHD2;
b) Max_FKBP5_SASH3_NEK6;
c) EPAS1; and
d) KLF9.

The analysis resulted with the following model equation:

Y ~ Max_BAD_BAMBI_CHD2 + 5 × Max_FKBP5_SASH3_NEK6 + 23 × EPAS1 − 3 × KLF9 − 25.
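As an illustration, the model equation can be evaluated for one subject as in the sketch below. The function name and input values are hypothetical. The text describes the datasets as qPCR delta Ct results but does not state the exact scale of the inputs, so the sketch simply takes the per-gene values as given.

```python
def cancer_healthy_score(values):
    """Evaluate Y for the Cancer-Healthy model.

    values: dict mapping gene name -> expression value on the scale
            used to fit the model (assumed, not specified here).
    """
    # Feature 1: maximum over the first correlated cluster.
    max_bad_bambi_chd2 = max(values["BAD"], values["BAMBI"], values["CHD2"])
    # Feature 2: maximum over the second correlated cluster.
    max_fkbp5_sash3_nek6 = max(values["FKBP5"], values["SASH3"], values["NEK6"])
    return (max_bad_bambi_chd2
            + 5 * max_fkbp5_sash3_nek6
            + 23 * values["EPAS1"]
            - 3 * values["KLF9"]
            - 25)


# Hypothetical subject, not taken from the study data.
subject = {
    "BAD": 12.0, "BAMBI": 2.0, "CHD2": 8.0,
    "FKBP5": 1.0, "SASH3": 3.0, "NEK6": 2.0,
    "EPAS1": 0.5, "KLF9": 4.0,
}
score = cancer_healthy_score(subject)  # 12 + 15 + 11.5 - 12 - 25
```

Taking the maximum within each cluster means that a single elevated gene in a highly correlated group is enough to raise the score, so the correlated genes are not counted several times.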

Receiver operating characteristic (ROC) curve analysis was used to evaluate the separation capability of the model (FIG. 5) and yielded an AUC of 84.3% (95% asymptotic CI: 74.8%-93.9%, P value<0.001). The point of specificity above 85% and the point of maximum Youden index (sensitivity+specificity−1) met at the point 0.84, with a sensitivity of 75% and a specificity of 93% (FIG. 6).

The case processing summary is provided in Table 10:

Label | Valid N (listwise)
Positive^(a) | 28
Negative^(b) | 41
Missing | 27
^(a)subject for which gene result was positive, under the nonparametric assumption
^(b)subject for which gene result was negative (null hypothesis: true area = 0.5)
c - subjects for which results were missing

For the Healthy-AD dataset, a t-test and/or a stepwise regression model was used to select the features that participated in model building. BAD and NEK6 (SEQ ID NOs: 2 and 5, respectively) were selected, and the equation for this model was as follows:

Y ~ BAD + 11 × NEK6 − 48

ROC analysis was used to evaluate the separation capability of the model (FIG. 7) on Healthy-AD and yielded an AUC of 70.5% (95% asymptotic CI: 58.5%-82.5%, P value<0.001). The point of specificity above 85% and the point of maximum Youden index met at the point 2, with a sensitivity of 60% and a specificity of 87% (FIG. 8).
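The following sketch shows one way the Healthy-AD score and a Youden-index operating point could be computed. It is not the SPSS procedure used in the study. The function names and the scores are hypothetical, and a sample is assumed to be called positive when its score is at or above the threshold.

```python
def healthy_ad_score(values):
    """Evaluate Y = BAD + 11 * NEK6 - 48 for one subject."""
    return values["BAD"] + 11 * values["NEK6"] - 48


def youden_best_threshold(scores, labels):
    """Scan observed scores as thresholds and return the best one.

    labels: 1 for a case, 0 for a healthy subject.
    Returns (threshold, youden_index, sensitivity, specificity) for
    the threshold with the largest Youden index
    (sensitivity + specificity - 1); ties keep the lowest threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for threshold in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels)
                       if y == 1 and s >= threshold)
        true_neg = sum(1 for s, y in zip(scores, labels)
                       if y == 0 and s < threshold)
        sens = true_pos / positives
        spec = true_neg / negatives
        youden = sens + spec - 1
        if best is None or youden > best[1]:
            best = (threshold, youden, sens, spec)
    return best


# Hypothetical scores: four healthy subjects followed by four cases.
scores = [-10, -5, 0, 1, 3, 5, 8, -2]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
best_point = youden_best_threshold(scores, labels)

example_score = healthy_ad_score({"BAD": 30.0, "NEK6": 2.0})
```

In the text the Youden-optimal point coincided with the requirement of specificity above 85%; when the two criteria disagree, the scan could be restricted to thresholds that meet the specificity requirement.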

The case processing summary is provided in Table 11:

Label | Valid N (listwise)
Positive^(a) | 38
Negative^(b) | 46
Missing | 24
^(a)subject for which gene result was positive, under the nonparametric assumption
^(b)subject for which gene result was negative (null hypothesis: true area = 0.5)
c - subjects for which results were missing

These analyses strongly demonstrated that, although purified plasma RNA is not of good quality, it is still possible to identify genes relevant to the detection of advanced adenoma or colorectal carcinoma.

The foregoing description of the specific embodiments will so fully reveal the general nature of the invention that others can, by applying current knowledge, readily modify and/or adapt for various applications such specific embodiments without undue experimentation and without departing from the generic concept, and, therefore, such adaptations and modifications should and are intended to be comprehended within the meaning and range of equivalents of the disclosed embodiments. It is to be understood that the phraseology or terminology employed herein is for the purpose of description and not of limitation. The means, materials, and steps for carrying out various disclosed functions may take a variety of alternative forms without departing from the invention. 

1. A method for treating colorectal cancer or precancerous advanced colorectal polyps in a subject, the method comprising: (a) obtaining sample mRNA from the subject; (b) detecting overexpression of a biomarker comprising at least one nucleic acid sequence selected from the group consisting of SEQ ID NO: 1-17; and (c) treating the subject with overexpression of the at least one nucleic acid sequence, wherein the treatment comprises at least one of administering a chemotherapeutic agent, performing bowel resection, applying radiation therapy, and a combination thereof.
 2. The method of claim 1, wherein said biomarker comprises a plurality of nucleic acid sequences selected from the group consisting of SEQ ID NO: 1-17.
 3. The method of claim 2, wherein said biomarker comprises at least three nucleic acid sequences selected from the group consisting of SEQ ID NO: 1-17.
 4. The method of claim 1, wherein said biomarker comprises SEQ ID NO: 1 and at least one additional nucleic acid sequence selected from the group consisting of SEQ ID NO: 2-17.
 5. The method of claim 1, wherein said biomarker comprises SEQ ID NO: 2 and at least one additional nucleic acid sequence selected from the group consisting of SEQ ID NO: 1 and 3-17.
 6. The method of claim 1, wherein said biomarker comprises SEQ ID NO: 3 and at least one additional nucleic acid sequence selected from the group consisting of SEQ ID NO: 1, 2 and 4-17.
 7. The method of claim 1, wherein said biomarker comprises SEQ ID NO: 4 and at least one additional nucleic acid sequence selected from the group consisting of SEQ ID NO: 1-3 and 5-17.
 8. The method of claim 1, wherein said biomarker comprises SEQ ID NO: 5 and at least one additional nucleic acid sequence selected from the group consisting of SEQ ID NO: 1-4 and 6-17.
 9. The method of claim 1, wherein said biomarker comprises SEQ ID NO: 6 and at least one additional nucleic acid sequence selected from the group consisting of SEQ ID NO: 1-5 and 7-17.
 10. The method of claim 1, wherein said biomarker comprises SEQ ID NO: 7 and at least one additional nucleic acid sequence selected from the group consisting of SEQ ID NO: 1-6 and 8-17.
 11. The method of claim 1, wherein said biomarker comprises SEQ ID NO: 8 and at least one additional nucleic acid sequence selected from the group consisting of SEQ ID NO: 1-7 and 9-17.
 12. The method of claim 1, wherein said biomarker comprises SEQ ID NO: 9 and at least one additional nucleic acid sequence selected from the group consisting of SEQ ID NO: 1-8 and 10-17.
 13. The method of claim 1, wherein said biomarker comprises SEQ ID NO: 10 and at least one additional nucleic acid sequence selected from the group consisting of SEQ ID NO: 1-9 and 11-17.
 14. The method of claim 1, wherein said biomarker comprises SEQ ID NO: 11 and at least one additional nucleic acid sequence selected from the group consisting of SEQ ID NO: 1-10 and 12-17.
 15. The method of claim 1, wherein said biomarker comprises SEQ ID NO: 12 and at least one additional nucleic acid sequence selected from the group consisting of SEQ ID NO: 1-11 and 13-17.
 16. The method of claim 1, wherein said biomarker comprises SEQ ID NO: 13 and at least one additional nucleic acid sequence selected from the group consisting of SEQ ID NO: 1-12 and 14-17.
 17. The method of claim 1, wherein said biomarker comprises SEQ ID NO: 14 and at least one additional nucleic acid sequence selected from the group consisting of SEQ ID NO: 1-13 and 15-17.
 18. The method of claim 1, wherein said biomarker comprises SEQ ID NO: 15 and at least one additional nucleic acid sequence selected from the group consisting of SEQ ID NO: 1-14 and 16-17.
 19. The method of claim 1, wherein said biomarker comprises SEQ ID NO: 16 and at least one additional nucleic acid sequence selected from the group consisting of SEQ ID NO: 1-15 and 17.
 20. The method of claim 1, wherein said biomarker comprises SEQ ID NO: 17 and at least one additional nucleic acid sequence selected from the group consisting of SEQ ID NO: 1-16.